Supercharged proteins for cell penetration

ABSTRACT

Compositions, preparations, systems, and related methods for delivering a supercharged protein, or a complex of a supercharged protein and an agent (e.g., nucleic acids, peptides, proteins, small molecules) to cells are provided. Such systems and methods include the use of supercharged proteins. For example, superpositively charged proteins may be associated with nucleic acids (which typically have a net negative charge) via electrostatic interactions. In some embodiments, such systems and methods involve altering the primary sequence of a protein in order to “supercharge” the protein (e.g., to generate a superpositively-charged protein). In some embodiments, complexes comprising supercharged proteins and one or more agents to be delivered are useful as therapeutic agents. In some embodiments, complexes and/or pharmaceutical compositions thereof are administered to a subject in need thereof. The inventive complexes or pharmaceutical compositions thereof may be used to treat proliferative diseases, infectious diseases, cardiovascular diseases, inborn errors in metabolism, genetic diseases, etc.

GOVERNMENT SUPPORT

This invention was made with U.S. Government support under contract number R01 GM 065400 awarded by the National Institutes of Health/NIGMS. The U.S. Government has certain rights in the invention.

BACKGROUND OF THE INVENTION

The effectiveness of an agent intended for use as a therapeutic, diagnostic, or other application is often highly dependent on its ability to penetrate cellular membranes or tissue to induce a desired change in biological activity. Although many therapeutic drugs, diagnostic or other product candidates, whether protein, nucleic acid, organic small molecule, or inorganic small molecule, show promising biological activity in vitro, many fail to reach or penetrate target cells to achieve the desired effect, often due to physicochemical properties that result in inadequate biodistribution in vivo.

In particular, nucleic acids have great potential as effective therapeutic agents and as research tools. The generality and sequence-specificity of siRNA-mediated gene regulation have raised the possibility of using siRNAs as gene-specific therapeutic agents (Bumcrot et al., 2006, Nat. Chem. Biol., 2:711-19; incorporated herein by reference). The suppression of gene expression by short interfering RNA (siRNA) has also emerged as a valuable tool for studying gene and protein function (Dorsett et al., 2004, Nat. Rev. Drug Discov., 3:318-29; Dykxhoorn et al., 2003, Nat. Rev. Mol. Cell. Biol., 4:457-67; Elbashir et al., 2001, Nature, 411:494-98; each of which is incorporated herein by reference). However, the delivery of nucleic acids such as siRNAs to cells has been found to be unpredictable and is typically inefficient. One obstacle to effective delivery of nucleic acids to cells is inducing cells to take up the nucleic acid. Much work has been done to identify agents that can aid in the delivery of nucleic acids to cells. Commercially available cationic lipid reagents are typically used to transfect siRNA in cell culture. The effectiveness of cationic lipid-based siRNA delivery, however, varies greatly by cell type. Also, a number of cell lines including some primary neuron, T-cell, fibroblast, and epithelial cell lines have demonstrated resistance to common cationic lipid transfection techniques (Carlotti et al., 2004, Mol. Ther., 9:209-17; Ma et al., 2002, Neuroscience, 112:1-5; McManus et al., 2002, J. Immunol., 169:5754-60; Strait et al., 2007, Am. J. Physiol. Renal Physiol., 293:F601-06; each of which is incorporated herein by reference). Alternative transfection approaches including electroporation (Jantsch et al., 2008, J. Immunol. Methods, 337:71-77; incorporated herein by reference) and virus-mediated siRNA delivery (Brummelkamp et al., 2002, Cancer Cell, 2:243-47; Stewart et al., 2003, RNA, 9:493-501; each of which is incorporated herein by reference) have also been used; however, these methods can be cytotoxic or perturb cellular function in unpredictable ways and have limited value for the delivery of nucleic acids (e.g., siRNA) as therapeutic agents in a subject.

Recent efforts to address the challenges of nucleic acid delivery have resulted in a variety of new nucleic acid delivery platforms. These methods include lipidoids (Akinc et al., 2008, Nat. Biotechnol., 26:561-69; incorporated herein by reference), cationic polymers (Segura and Hubbell, 2007, Bioconjug. Chem., 18:736-45; incorporated herein by reference), inorganic nanoparticles (Sokolova and Epple, Angew Chem. Int. Ed. Engl., 47:1382-95; incorporated herein by reference), carbon nanotubes (Liu et al., 2007, Angew Chem. Int. Ed. Engl., 46:2023-27; incorporated herein by reference), cell-penetrating peptides (Deshayes et al., 2005, Cell Mol. Life. Sci., 62:1839-49; and Meade and Dowdy, 2008, Adv. Drug Deliv. Rev., 60:530-36; both of which are incorporated herein by reference), and chemically modified siRNA (Krutzfeldt et al., 2005, Nature, 438:685-89; incorporated herein by reference). Each of these delivery systems offers benefits for particular applications; in most cases, however, questions regarding cytotoxicity, ease of preparation, stability, or generality remain. Easily prepared reagents capable of effectively delivering nucleic acids (e.g., siRNA) to a variety of cell lines without significant cytotoxicity therefore remain of considerable interest.

Given the current interest in RNAi therapies and other nucleic acid-based therapies, there remains a need in the art for reagents and systems that can be used to deliver nucleic acids as well as other agents (e.g., peptides, proteins, small molecules) to a wide variety of cell types predictably and efficiently.

Similarly, the inability of most proteins to spontaneously enter mammalian cells limits their usefulness as research tools and their potential as therapeutic agents. Proteins have demonstrated great potential as research tools (including hormones, cytokines, and antibodies) and as human therapeutics (including erythropoietin, insulin, and interferons). Due to the inability of most proteins to spontaneously enter cells, however, exogenous proteins are largely restricted to interacting with extracellular targets. Over the past decade, techniques for the delivery of proteins into mammalian cells have been developed to address intracellular targets. These techniques include lipid-based reagents (Zelphati et al., J. Biol. Chem. 276, 35103-35110, 2001), nanoparticles (Hasadsri et al., J. Biol. Chem., 2009), vault ribonucleoprotein particles (Lai et al., ACS Nano 3, 691-699, 2009), and genetic or chemical fusion to receptor ligands (Gabel et al., J. Cell Biol. 103, 1817-1827, 1986; Rizk et al., Proc. Natl. Acad. Sci. U.S.A. 106, 11011-11015, 2009) or cell-penetrating peptides (Wadia et al., Curr. Protein Pept. Sci. 4, 97-104, 2003; Zhou et al., Cell Stem Cell 4, 381-384, 2009). Perhaps the most common method for protein delivery is genetic fusion to protein transduction domains (PTDs) including the HIV-1 transactivator of transcription (Tat) peptide and polyarginine peptides. These cationic PTDs promote association with negatively charged cell-surface structures and subsequent endocytosis of exogenous proteins. Both Tat and polyarginine have been used to deliver a variety of macromolecules into cells both in vitro and in vivo (Wadia et al., Curr. Protein Pept. Sci. 4, 97-104, 2003; Zhou et al., Cell Stem Cell 4, 381-384, 2009; Myou et al., J. Immunol. 169, 2670-2676, 2002; Bae et al., Clin. Exp. Immunol. 157, 128-138, 2009; Schwarze et al., Science 285, 1569-1572, 1999).
Despite these advances, intracellular targets in many cases remain difficult to perturb using exogenous proteins; even modest success can require high concentrations of exogenous protein due to the low efficiency with which proteins are functionally delivered into cells (Zhou et al., Cell Stem Cell 4, 381-384, 2009; Wang et al., Nat. Biotechnol. 26, 901-908, 2008).

SUMMARY OF THE INVENTION

The present invention provides novel systems, compositions, preparations, and related methods for delivering nucleic acids and other agents (e.g., peptides, proteins, small molecules) into cells using a protein that has been modified to result in an increase or decrease in the overall surface charge on the protein, referred to henceforth as “supercharging.” Thus, supercharging can be used to promote the entry into a cell in vivo or in vitro of a supercharged protein, or agent(s) associated with the supercharged protein that together form a complex. Such systems and methods may comprise the use of proteins that have been engineered to be supercharged and include all such modifications, including but not limited to, those involving changes in amino acid sequence as well as the attachment of charged moieties to the protein. Examples of engineered supercharged proteins are described in international PCT patent application, PCT/US07/70254, filed Jun. 1, 2007, published as WO 2007/143574 on Dec. 13, 2007; and in U.S. provisional patent applications, U.S. Ser. No. 60/810,364, filed Jun. 2, 2006, and U.S. Ser. No. 60/836,607, filed Aug. 9, 2006; each of which is entitled “Protein Surface Remodeling,” and each of which is incorporated herein by reference. Further examples of supercharged proteins useful in drug delivery are also described herein. The present invention also contemplates the use of naturally occurring supercharged proteins to enhance cell penetration of associated agents that together form a complex or to enhance the cell penetration of the naturally occurring supercharged protein itself. Typically, the supercharged protein, engineered or naturally occurring, is positively charged. In certain embodiments, superpositively charged proteins may be associated with nucleic acids (which typically have a net negative charge) via electrostatic interactions, thereby aiding in the delivery of the nucleic acid to a cell.
Superpositively charged proteins may also be associated covalently or non-covalently with the nucleic acid to be delivered in other ways. Other agents such as peptides or small molecules may also be delivered to cells using supercharged proteins that are covalently bound or otherwise associated (e.g., electrostatic interactions) with the agent to be delivered. In certain embodiments, the supercharged protein is fused with a second protein sequence. For example, in certain embodiments, the agent to be delivered and the superpositively charged protein are expressed together in a single polypeptide chain as a fusion protein. In certain embodiments, the fusion protein has a linker, e.g., a cleavable linker between the supercharged protein and the other protein component. In certain embodiments, the agent to be delivered and the supercharged protein, e.g., a superpositively charged protein, are associated with each other via a cleavable linker (e.g., a linker cleavable by a protease or esterase, disulfide bond). The supercharged protein, e.g., a superpositively charged protein, useful in the present invention is typically non-antigenic, biodegradable, and/or biocompatible. In certain embodiments, the superpositively charged protein does not have biological activity or any deleterious biological activity. In certain embodiments the supercharged protein has a mutation or other alteration (e.g., a post-translational modification such as a cleavage or other covalent modification) which decreases or abolishes a biological activity exhibited by the protein prior to supercharging. This may be of particular interest when the supercharged protein is of interest not because of its own biological activity but for use in delivering an agent to a cell. Without wishing to be bound by a particular theory, anionic cell-surface proteoglycans are thought to serve as a receptor for the actin-dependent endocytosis of the superpositively charged protein bound to its payload.
The inventive supercharged proteins or delivery system using supercharged, e.g., superpositively charged proteins, may include the use of other pharmaceutically acceptable excipients such as polymers, lipids, carbohydrates, small molecules, targeting moieties, endosomolytic agents, proteins, peptides, etc. For example, a supercharged protein or complex of a supercharged protein, e.g., a superpositively charged protein, and agent to be delivered may be contained within or be associated with a microparticle, nanoparticle, picoparticle, micelle, liposome, or other drug delivery system. In other embodiments, only the agent to be delivered and the supercharged protein are used to deliver the agent to a cell. In certain embodiments, the supercharged protein is chosen to deliver itself or an associated agent to a particular cell or tissue type. In certain embodiments, the supercharged, e.g., superpositively charged, protein or agent to be delivered and the supercharged protein are combined with an agent that disrupts endosomolytic vesicles or enhances the degradation of endosomes (e.g., chloroquine, pyrene butyric acid, fusogenic peptides, polyethyleneimine, hemagglutinin 2 (HA2) peptide, melittin peptide). Thus, escape of the agent to be delivered from the endosome into the cytosol is enhanced.

In some embodiments, the inventive systems and methods involve altering the primary sequence of a protein in order to “supercharge” the protein. In other embodiments, the inventive systems and methods involve the attachment of charged moieties to the protein in order to “supercharge” the protein. That is, the overall net charge on the modified protein is increased (either more positive charge or more negative charge) compared to the unmodified protein. In certain embodiments, the protein is supercharged, e.g., superpositively charged, to enable the delivery of nucleic acids or other agents to a cell. Any protein may be “supercharged”. Typically, the protein is non-immunogenic and either naturally or upon supercharging has the ability to transfect or deliver itself or an associated agent into a cell. In certain embodiments, the activity of the supercharged protein is approximately or substantially the same as the protein without modification. In other embodiments, the activity of the supercharged protein is substantially decreased as compared to the protein without modification. Such activity may not be relevant to the delivery of itself or an associated agent, e.g., nucleic acids, to cells as described herein. In some embodiments, supercharging a protein results in increasing the protein's resistance to aggregation, solubility, ability to refold, and/or general stability under a wide range of conditions as well as increasing the protein's ability to deliver itself or an associated agent, e.g., nucleic acids, to a cell. In certain embodiments, the supercharged protein helps to target itself or an associated agent to be delivered to a particular cell type, tissue, or organ.
In certain embodiments, supercharging a protein includes the steps of: (a) identifying surface residues of a protein of interest; (b) optionally, identifying the particular surface residues that are not highly conserved among other proteins related to the protein of interest (i.e., determining which amino acids are not essential for the activity or function of the protein); (c) determining the hydrophilicity of the identified surface residues; and (d) replacing one or more of the identified charged or polar, solvent-exposed residues with an amino acid that is charged at physiological pH. See published international PCT patent application, PCT/US07/70254, filed Jun. 1, 2007, published as WO 2007/143574 on Dec. 13, 2007; and U.S. provisional patent applications, U.S. Ser. No. 60/810,364, filed Jun. 2, 2006, and U.S. Ser. No. 60/836,607, filed Aug. 9, 2006; each of which is entitled “Protein Surface Remodeling”; and each of which is incorporated herein by reference. Exemplary methods of preparing supercharged proteins and exemplary protein sequences illustrating the use of the method are described herein. In certain embodiments, to make a positively charged “supercharged” protein, the residues identified for modification are mutated either to lysine (Lys) or arginine (Arg) residues (i.e., amino acids that are positively charged at physiological pH). In certain embodiments, to make a negatively charged “supercharged” protein, the residues identified for modification are mutated either to aspartate (Asp) or glutamate (Glu) residues (i.e., amino acids that are negatively charged at physiological pH). Each of the above steps may be carried out using any technique, computer software, algorithm, methodology, paradigm, etc. known in the art. After the modified protein is created, it may be tested for its activity and/or the desired property being sought (e.g., the ability to deliver a nucleic acid or other agent into a cell).
In certain embodiments, the supercharged protein is less susceptible to aggregation. In certain embodiments, a positively charged “supercharged” protein (e.g., superpositively charged green fluorescent protein (GFP) such as +36 GFP) is useful in delivering a nucleic acid (e.g., an siRNA agent) to a cell (e.g., a mammalian cell, a human cell). In certain embodiments, the inventive system allows for the delivery of nucleic acids into cells normally resistant to transfection (e.g., neuronal cells, T-cells, fibroblasts, and epithelial cells). In certain embodiments, rather than engineering a supercharged protein, a naturally occurring supercharged protein is identified and used in the inventive drug delivery system. Examples of naturally occurring supercharged proteins include, but are not limited to, cyclon (ID No.: Q9H6F5), PNRC1 (ID No.: Q12796), RNPS1 (ID No.: Q15287), SURF6 (ID No.: O75683), AR6P (ID No.: Q66PJ3), NKAP (ID No.: Q8N5F7), EBP2 (ID No.: Q99848), LSM11 (ID No.: P83369), RL4 (ID No.: P36578), KRR1 (ID No.: Q13601), RY-1 (ID No.: Q8WVK2), BriX (ID No.: Q8TDN6), MNDA (ID No.: P41218), H1b (ID No.: P16401), cyclin (ID No.: Q9UK58), MDK (ID No.: P21741), Midkine (ID No.: P21741), PROK (ID No.: Q9HC23), FGF5 (ID No.: P12034), SFRS (ID No.: Q8N9Q2), AKIP (ID No.: Q9NWT8), CDK (ID No.: Q8N726), beta-defensin (ID No.: P81534), Defensin 3 (ID No.: P81534), PAVAC (ID No.: P18509), PACAP (ID No.: P18509), eotaxin-3 (ID No.: Q9Y258), histone H2A (ID No.: Q7L7L0), HMGB1 (ID No.: P09429), C-Jun (ID No.: P05412), TERF 1 (ID No.: P54274), N-DEK (ID No.: P35659), PIAS 1 (ID No.: O75925), Ku70 (ID No.: P12956), HBEGF (ID No.: Q99075), HGF (ID No.: P14210), HRX (ID No.: Q03164), and histone 4 (ID No.: P62805).
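Steps (a) through (d) of the supercharging procedure described above can be illustrated with a minimal Python sketch. It is not the method of the incorporated applications: the function name `supercharge`, the `POLAR_OR_CHARGED` residue set, and the assumption that surface and conserved positions are supplied as precomputed zero-based index sets (for instance, from a structure and an alignment) are all hypothetical choices made for illustration.

```python
# Illustrative sketch only. The residue classification below is an assumption,
# not a definition taken from the specification.
POLAR_OR_CHARGED = set("DENQSTHKR")


def supercharge(sequence, surface_positions, conserved_positions=frozenset(),
                target="K", max_mutations=None):
    """Mutate polar/charged, non-conserved surface residues to `target`.

    sequence            -- one-letter amino acid string
    surface_positions   -- zero-based indices judged solvent-exposed (step a)
    conserved_positions -- indices to leave untouched (optional step b)
    target              -- "K" or "R" for positive, "D" or "E" for negative
    Returns (mutated_sequence, list_of_mutated_positions).
    """
    if target not in "KRDE" or len(target) != 1:
        raise ValueError("target must be one of K, R, D, E")
    # Residues that already carry the desired sign are left alone.
    same_sign = "KR" if target in "KR" else "DE"
    residues = list(sequence)
    mutated = []
    for pos in sorted(surface_positions):
        if max_mutations is not None and len(mutated) >= max_mutations:
            break
        if pos in conserved_positions:
            continue
        # Steps (c)/(d): only charged or polar exposed residues are replaced.
        if residues[pos] in POLAR_OR_CHARGED and residues[pos] not in same_sign:
            residues[pos] = target
            mutated.append(pos)
    return "".join(residues), mutated


if __name__ == "__main__":
    new_seq, changed = supercharge("MDESTAQL", {1, 2, 3, 5, 6},
                                   conserved_positions={3})
    print(new_seq, changed)
```

In this toy call, position 3 is skipped because it is marked conserved and position 5 (alanine) is skipped because it is neither charged nor polar. A real workflow would still need to test the resulting protein for activity and delivery, as the text notes.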

In certain embodiments, once a supercharged protein has been obtained, systems and methods in accordance with the invention involve associating one or more nucleic acids or other agents with the supercharged protein and contacting the resulting complex with a cell under suitable conditions for the cell to take up the payload. The nucleic acid may be a DNA, RNA, and/or hybrid or derivative thereof. In certain embodiments, the nucleic acid is an RNAi agent, RNAi-inducing agent, short interfering RNA (siRNA), short hairpin RNA (shRNA), micro RNA (miRNA), antisense RNA, ribozyme, catalytic DNA, RNA that induces triple helix formation, aptamer, vector, plasmid, viral genome, artificial chromosome, etc. In some embodiments, the nucleic acid is single-stranded. In other embodiments, the nucleic acid is double-stranded. In some embodiments, a nucleic acid may comprise one or more detectable labels (e.g., fluorescent tags and/or radioactive atoms). In certain embodiments, the nucleic acid is modified or derivatized (e.g., to be less susceptible to degradation, to improve transfection efficiency). In certain embodiments, the modification of the nucleic acid prevents the degradation of the nucleic acid. In certain embodiments, the modification of the nucleic acid aids in the delivery of the nucleic acid to a cell. Other agents that may be delivered using a supercharged protein include small molecules, peptides, and proteins. The resulting complex may then be combined or associated with other pharmaceutically acceptable excipient(s) to form a composition suitable for delivering the agent to a cell, tissue, organ, or subject.

Supercharged proteins may be associated with nucleic acids (or other agents) via non-covalent interactions to form a complex. Although covalent association of the supercharged protein with a nucleic acid is possible, it is typically not necessary to achieve delivery of the nucleic acid. In some embodiments, supercharged proteins are associated with nucleic acids via electrostatic interactions. Supercharged proteins may be associated with nucleic acids through other non-covalent interactions or covalent interactions. The supercharged proteins may have a net positive charge of at least +5, +10, +15, +20, +25, +30, +35, +40, or +50. In some embodiments, superpositively charged proteins are associated with nucleic acids that have an overall net negative charge. The resulting complex may have a net negative or positive charge. In certain embodiments, the complex has a net positive charge. For example, +36 GFP may be associated with a negatively charged siRNA.
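As a rough illustration of the net charge thresholds listed above, the following Python sketch estimates a theoretical net charge from sequence alone by counting Lys and Arg as +1 and Asp and Glu as -1. This counting rule is a simplifying assumption made here: it ignores histidine, the chain termini, local pKa shifts, and any attached charged moieties. The function names are hypothetical.

```python
def theoretical_net_charge(sequence):
    """Approximate net charge at physiological pH from a one-letter sequence.

    Assumption: Lys/Arg contribute +1 each and Asp/Glu contribute -1 each;
    every other residue, and both termini, are treated as neutral.
    """
    positive = sum(sequence.count(residue) for residue in "KR")
    negative = sum(sequence.count(residue) for residue in "DE")
    return positive - negative


def meets_threshold(sequence, threshold=5):
    """True if the estimated net charge is at least `threshold` (e.g., +5)."""
    return theoretical_net_charge(sequence) >= threshold


if __name__ == "__main__":
    toy = "KKKKKKDA"  # toy sequence for illustration, not a real protein
    print(theoretical_net_charge(toy), meets_threshold(toy, threshold=5))
```

The same helper can be applied before and after a set of mutations to see how far a candidate design has moved toward a chosen threshold.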

Supercharged proteins may be associated with other agents besides nucleic acids via non-covalent or covalent interactions. For example, a negatively charged protein may be associated with a superpositively charged protein through electrostatic interactions. For agents that are not charged or do not have sufficient charge, the agent may be covalently associated with the supercharged protein to effect delivery of the agent to a cell. For example, a peptide therapeutic may be fused to the supercharged protein in order to deliver the peptide therapeutic to a cell. In certain embodiments, the supercharged protein and the peptide may be joined via a cleavable linker. To give but another example, a small molecule may be conjugated to a supercharged protein for delivery to a cell. The agent may also be associated with the supercharged protein through non-covalent interactions (e.g., ligand-receptor interaction, dipole-dipole interaction, etc.).

The present invention provides complexes comprising supercharged proteins and one or more molecules of the agent to be delivered. In some embodiments, such complexes comprise multiple agent molecules per supercharged protein molecule. In some embodiments, such complexes comprise 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, or more agent (e.g., nucleic acid) molecules per supercharged protein molecule. In certain particular embodiments, a complex comprises approximately 1-2 nucleic acid molecules (e.g., siRNA) to approximately 1 supercharged protein molecule. In other embodiments, such complexes comprise multiple protein molecules per agent molecule. In some embodiments, such complexes comprise 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, or more protein molecules per agent molecule. In certain embodiments, such complexes comprise approximately one agent molecule and approximately one superpositively charged protein molecule. In certain embodiments, the overall net charge on the agent/supercharged protein complex is negative. In certain embodiments, the overall net charge on the agent/supercharged protein complex is positive. In certain embodiments, the overall net charge on the agent/supercharged protein complex is neutral. In certain particular embodiments, the overall net charge on the nucleic acid/supercharged protein complex is positive.
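The relationship between stoichiometry and the overall net charge of a complex can be sketched with simple arithmetic, assuming that charges are additive and fixed. The Python helper below is a hypothetical illustration; the agent charge of -40 used in the example is an assumed round figure for a short double-stranded RNA, not a value taken from this specification.

```python
def complex_net_charge(protein_charge, agent_charge, n_protein=1, n_agent=1):
    """Sum of formal charges for n_protein proteins and n_agent agents.

    Assumption: charges simply add; counter-ions and charge regulation
    on binding are ignored.
    """
    return n_protein * protein_charge + n_agent * agent_charge


def classify_charge(net_charge):
    """Label a net charge as 'positive', 'negative', or 'neutral'."""
    if net_charge > 0:
        return "positive"
    if net_charge < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    ASSUMED_AGENT_CHARGE = -40  # hypothetical value, for illustration only
    for n_protein in (1, 2):
        total = complex_net_charge(36, ASSUMED_AGENT_CHARGE,
                                   n_protein=n_protein, n_agent=1)
        print(n_protein, total, classify_charge(total))
```

Under these assumptions, one +36 protein per agent gives a slightly negative complex, while two proteins per agent give a positive one, which shows how the ratio of components can move the complex between the negative, neutral, and positive embodiments described above.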

In another aspect, the present invention provides pharmaceutical compositions comprising: a) one or more supercharged proteins; b) one or more complexes of supercharged protein and an agent to be delivered; or c) one or more of a) or one or more of b), in accordance with the invention and at least one pharmaceutically acceptable excipient. The amount of the complex in the composition may be the amount useful to induce a desired biological response in the cell, for example, increase or decrease the expression of a particular gene in the cell. In certain embodiments, the complex is associated with a targeting moiety (e.g., small molecule, protein, peptide, carbohydrate, etc.) used to direct the delivery of the agent to a particular cell, type of cell, tissue, or organ.

In some embodiments, a supercharged protein or complexes comprising supercharged proteins, engineered or naturally occurring, and one or more nucleic acids (and/or pharmaceutical compositions thereof) are useful as therapeutic agents. In some embodiments, a nucleic acid and/or supercharged protein may be therapeutically active. In certain embodiments, the nucleic acid is therapeutically active. For example, some conditions (e.g., cancer, inflammatory diseases) are associated with the expression of certain mRNAs and/or proteins. Supercharged proteins associated with RNAi agents targeting an expressed mRNA may be useful for treating such conditions. Alternatively, some conditions are associated with underexpression of certain mRNAs and/or proteins (e.g., cancer, inborn errors in metabolism). Supercharged proteins associated with vectors that drive expression of the deficient mRNA and/or protein may be useful for treating such conditions.

The present invention also provides kits useful for producing the inventive supercharged protein or supercharged protein/agent complexes or compositions thereof, and/or using such complexes to transfect or deliver the supercharged protein or an agent into a cell. The inventive kits may also include instructions for administering or using the inventive supercharged proteins or complexes, or a pharmaceutical composition thereof. For example, the kit may include instructions for prescribing the pharmaceutical composition to a subject. The kit may include enough materials for multiple unit doses of the agent. The kit may be designed for therapeutic or research purposes. The kit may optionally include the agent (e.g., siRNA, peptide, drug) to be delivered, or the agent may be provided by the end user.

The present invention also provides a method of introducing a supercharged protein or an agent associated with a supercharged protein, or both, into a cell. The inventive method comprises contacting the supercharged protein, or a supercharged protein and an agent associated with the supercharged protein with the cell, e.g., under conditions sufficient to allow penetration of said supercharged protein, or an agent associated with a supercharged protein, into the cell, thereby introducing a supercharged protein, or an agent associated with a supercharged protein, or both, into a cell. In certain embodiments, sufficient supercharged protein or agent enters the cell to allow for one or more of detection of the supercharged protein or agent in the cell; a change in a biological property of the cell, e.g., growth rate, pattern of gene expression, or viability, of the cell; or detection of a biological effect of the supercharged protein or agent. In certain embodiments, the contact is performed in vitro. In certain embodiments, the contact is performed in vivo, e.g., in the body of a subject, e.g., a human or other animal. In one in vivo embodiment, sufficient supercharged protein, agent, or both is present in the cell to provide a detectable effect in the subject, e.g., a therapeutic effect. In one in vivo embodiment, sufficient supercharged protein, agent, or both is present in the cell to allow imaging of one or more penetrated cells or tissues. In certain embodiments, the observed or detectable effect arises from cell penetration.

The present invention also provides a method of evaluating a supercharged protein for cell penetration comprising: optionally, selecting a supercharged protein; providing said supercharged protein; and contacting said supercharged protein with a cell and determining if the supercharged protein penetrates the cell, thereby providing an evaluation of a supercharged protein for cell penetration.

The present invention also provides a method of evaluating a supercharged protein for cell penetration comprising: selecting a protein to be supercharged; obtaining a set of one or a plurality of residues to be varied to produce a supercharged protein, wherein the set was generated by a method described herein (obtaining includes generating the set or receiving the identity of one or more members of the set from another party); providing (e.g., by making or receiving it from another party) a supercharged protein having said set of varied residues; and contacting said supercharged protein with a cell and determining if the supercharged protein penetrates the cell, thereby evaluating a supercharged protein for cell penetration. The method can allow for a party to develop supercharged proteins or to collaborate with others to do so.

In some embodiments, the present invention provides a supercharged protein associated with a functional peptide or protein able to penetrate a cell and deliver the functional peptide or protein into the cell. The functional peptide or protein may be delivered through the plasma membrane, into cytoplasm, through the nuclear membrane, and/or into the nucleus. The functional protein or peptide is associated with the supercharged protein for delivery. In some embodiments, the supercharged protein is covalently bound to the functional peptide or protein. In some embodiments, the supercharged protein is bound to the functional protein or peptide via a peptide (amide) bond, in some cases forming a fusion protein. In some embodiments, the supercharged protein and the functional protein or peptide are associated through a linker connecting the supercharged protein to the functional peptide or protein. The linker may be cleavable or uncleavable. In some embodiments, the linker comprises an amide, ester, ether, carbon-carbon, or disulfide bond although any covalent bond in the chemical art may be used. In some embodiments, the linker comprises a labile bond, cleavage of which results in separation of the supercharged protein from the peptide or protein to be delivered. In some embodiments, the supercharged protein or linker is cleaved under conditions found in the cell (e.g., a reductive environment). In some embodiments, the supercharged protein or the linker is cleaved by a cellular enzyme. In some embodiments, the cellular enzyme is a cellular protease or a cellular esterase. In some embodiments, the cellular protease is an endosomal protease or an endosomal esterase. In some embodiments, the cellular enzyme is specifically expressed in a target cell or cell type, resulting in preferential or specific endosomal release of the functional protein or peptide in the target cell or cell type.
The target sequence of the protease may be engineered into the linker between the functional protein or peptide to be delivered and the supercharged protein. In some embodiments, the target cell or cell type is a cancer cell or cancer cell type, a cell or cell type of the immune system, or a pathologic or diseased cell or cell type. In some embodiments, the supercharged protein or the linker comprises an amino acid sequence chosen from the group including X-AGVF-X (SEQ ID NO: 11), X-GFLG-X (SEQ ID NO: 12), X-FK-X (SEQ ID NO: 13), X-AL-X (SEQ ID NO: 14), X-ALAL-X (SEQ ID NO: 15), or X-ALALA-X (SEQ ID NO: 16), wherein X denotes the supercharged protein or the functional peptide or protein.
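The listed linker motifs lend themselves to a simple sequence check. The Python sketch below scans a candidate linker for the motifs named above and assembles a fusion in the X-motif-X arrangement; the function names, the example sequences, and the plain substring search are hypothetical illustrations, and a substring match says nothing about whether a given protease would actually cleave the construct.

```python
# Motifs copied from the X-...-X sequences listed in the text above.
LINKER_MOTIFS = ("AGVF", "GFLG", "FK", "AL", "ALAL", "ALALA")


def find_linker_motifs(linker):
    """Return the listed motifs that occur anywhere in `linker`."""
    return [motif for motif in LINKER_MOTIFS if motif in linker]


def build_fusion(supercharged, motif, cargo):
    """Concatenate supercharged protein, linker motif, and cargo sequence."""
    if motif not in LINKER_MOTIFS:
        raise ValueError(f"{motif!r} is not one of the listed motifs")
    return supercharged + motif + cargo


if __name__ == "__main__":
    print(find_linker_motifs("GGSGFLGGGS"))
    # Toy sequences for illustration only, not real proteins.
    print(build_fusion("MKKRK", "GFLG", "MSTAQ"))
```

Because the shorter motifs are contained in the longer ones (AL within ALAL and ALALA), a linker containing ALALA is reported as matching all three.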

In some embodiments, the functional protein or peptide to be delivered into a cell is a transcription factor, a tumor suppressor, a developmental regulator, a growth factor, a metastasis suppressor, a pro-apoptotic protein, a zinc finger nuclease, or a recombinase. In some embodiments, the functional protein is p53, Rb (retinoblastoma protein), BRCA1, BRCA2, PTEN, APC, CD95, ST7, ST14, a BCL-2 family protein, a caspase; BRMS1, CRSP3, DRG1, KAI1, KISS1, NM23, a TIMP-family protein, a BMP-family growth factor, EGF, EPO, FGF, G-CSF, GM-CSF, a GDF-family growth factor, HGF, HDGF, IGF, PDGF, TPO, TGF-α, TGF-β, VEGF; a zinc finger nuclease targeting a site within the human CCR5 gene, Cre, Dre, or FLP recombinase.

In certain embodiments, the invention provides methods of delivering a peptide or protein to a cell. In some embodiments, the method includes a step of contacting the cell with a supercharged protein associated with the peptide or the protein to be delivered, under conditions sufficient for the peptide or protein to enter the cell. In some embodiments, the supercharged protein associated with the peptide or the protein is a supercharged protein associated with a functional peptide or protein.

In some embodiments, the peptide or protein to be delivered is a nuclear peptide or protein and the method results in delivery of the protein or peptide to the nucleus of the cell. In some embodiments, the protein delivered to the cell is a transcription factor. In some embodiments, the protein delivered to the cell is a reprogramming factor. In some embodiments, the cell is a somatic cell from a subject diagnosed with a disease. In certain embodiments, the cell is a mammalian cell (e.g., a human cell). In some embodiments, the cell is contacted with a supercharged protein associated with a reprogramming factor in an amount, for a time, and under conditions sufficient to induce reprogramming of the cell to a pluripotent state or less differentiated state. In some embodiments, the method further includes a step of isolating a pluripotent cell generated from a somatic cell. In some embodiments, the method further comprises a step of differentiating the isolated pluripotent cell, or progeny thereof, into a differentiated cell type. In some embodiments, the method further comprises a step of using the pluripotent cell, or differentiated progeny thereof, in a cell replacement therapeutic approach.

In some embodiments, the cell is a cell carrying an undesired genomic allele and the supercharged protein is associated with a nuclease specifically targeting the allele. In some embodiments, the undesired allele is associated with a disease, and the nuclease induces a mutation in the allele. In some embodiments, the cell is contacted ex vivo and then reintroduced into the subject after successful targeting of the undesired allele by the nuclease. In some embodiments, the nuclease is a zinc finger nuclease. In some embodiments, the nuclease targets the human CCR5 gene. In some embodiments, the subject is a subject diagnosed with HIV/AIDS, and the cell is a T-lymphocyte.

In some embodiments, the protein is a recombinase, and the cell's genome comprises a recombination site recognized by the recombinase. In some embodiments, the cell comprises a plurality of recombination sites recognized by the recombinase, and recombinase-mediated recombination of the plurality of recombination sites results in deletion of a region of the genome (e.g., a diseased gene).

In some embodiments, the cell is a tumor cell, and the protein is a tumor suppressor protein, a metastasis suppressor protein, a cytostatic protein, or a cytotoxic protein.

These and other aspects and embodiments of the invention, as well as various advantages and utilities, will become more apparent with reference to the drawings and detailed description of the invention.

DEFINITIONS

Agent to be delivered: As used herein, the phrase “agent to be delivered” refers to any substance that can be delivered to a subject, organ, tissue, cell, subcellular locale, and/or extracellular matrix locale. In some embodiments, the agent to be delivered is a biologically active agent, i.e., it has activity in a biological system and/or organism. For instance, a substance that, when administered to an organism, has a biological effect on that organism, is considered to be biologically active. In particular embodiments, where an agent to be delivered is a biologically active agent, a portion of that agent that shares at least one biological activity of the agent as a whole is typically referred to as a “biologically active” portion. In some embodiments, an agent to be delivered is a therapeutic agent. As used herein, the term “therapeutic agent” refers to any agent that, when administered to a subject, has a beneficial effect, such as a therapeutic, diagnostic, and/or prophylactic effect, and/or elicits a desired biological and/or pharmacological effect. A therapeutic agent may be a nucleic acid that is delivered to a cell via its association with a supercharged protein. In certain embodiments, the agent to be delivered is a nucleic acid. In certain embodiments, the agent to be delivered is DNA. In certain embodiments, the agent to be delivered is RNA. In certain embodiments, the agent to be delivered is a peptide or protein. In certain embodiments, the agent to be delivered is a small molecule. In some embodiments, the agent to be delivered is useful as an in vivo or in vitro imaging agent. In some of these embodiments, it is, and in others it is not, biologically active.

Animal: As used herein, the term “animal” refers to any member of the animal kingdom. In some embodiments, “animal” refers to humans at any stage of development. In some embodiments, “animal” refers to non-human animals at any stage of development. In certain embodiments, the non-human animal is a mammal (e.g., a rodent, a mouse, a rat, a rabbit, a monkey, a dog, a cat, a sheep, cattle, a primate, or a pig). In some embodiments, animals include, but are not limited to, mammals, birds, reptiles, amphibians, fish, and worms. In some embodiments, the animal is a transgenic animal, genetically-engineered animal, or a clone.

Approximately: As used herein, the term “approximately” or “about,” as applied to one or more values of interest, refers to a value that is similar to a stated reference value. In certain embodiments, the term “approximately” or “about” refers to a range of values that fall within 25%, 20%, 19%, 18%, 17%, 16%, 15%, 14%, 13%, 12%, 11%, 10%, 9%, 8%, 7%, 6%, 5%, 4%, 3%, 2%, 1%, or less in either direction (greater than or less than) of the stated reference value unless otherwise stated or otherwise evident from the context (except where such number would exceed 100% of a possible value).
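The range described above can be written as a simple relative-tolerance check. The following sketch is illustrative only; the function name and the default tolerance of 10% are assumptions, and the tolerance would be chosen from the percentages listed in the definition according to context.

```python
def is_approximately(value: float, reference: float, tolerance: float = 0.10) -> bool:
    """Return True if value lies within +/- tolerance (a fraction, e.g. 0.10
    for 10%) of the stated reference value, in either direction."""
    if tolerance < 0:
        raise ValueError("tolerance must be non-negative")
    return abs(value - reference) <= tolerance * abs(reference)


if __name__ == "__main__":
    print(is_approximately(105.0, 100.0))        # 5% away, 10% tolerance
    print(is_approximately(126.0, 100.0, 0.25))  # 26% away, 25% tolerance
```

The check does not implement the stated exception for numbers that would exceed 100% of a possible value; that constraint depends on the quantity being described.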

Associated with: As used herein, the terms “associated with,” “conjugated,” “linked,” “attached,” and “tethered,” when used with respect to two or more moieties, mean that the moieties are physically associated or connected with one another, either directly or via one or more additional moieties that serve as a linking agent, to form a structure that is sufficiently stable so that the moieties remain physically associated under the conditions in which the structure is used, e.g., physiological conditions. A supercharged protein is typically associated with a nucleic acid by a mechanism that involves non-covalent binding (e.g., electrostatic interactions). In certain embodiments, a positively charged, supercharged protein is associated with a nucleic acid through electrostatic interactions to form a complex. In some embodiments, a sufficient number of weaker interactions can provide sufficient stability for moieties to remain physically associated under a variety of different conditions. In certain embodiments, the agent to be delivered is covalently bound to the supercharged protein. In some embodiments, a peptide or protein is associated with a supercharged protein by a covalent bond (e.g., an amide bond). In some embodiments, a peptide or protein is associated with a supercharged protein directly by a peptide bond, or indirectly via a linker.

Biocompatible: As used herein, the term “biocompatible” refers to substances that are not toxic to cells. In some embodiments, a substance is considered to be “biocompatible” if its addition to cells in vivo does not induce inflammation and/or other adverse effects in vivo. In some embodiments, a substance is considered to be “biocompatible” if its addition to cells in vitro or in vivo results in less than or equal to about 50%, about 45%, about 40%, about 35%, about 30%, about 25%, about 20%, about 15%, about 10%, about 5%, or less than about 5% cell death.

Biodegradable: As used herein, the term “biodegradable” refers to substances that are degraded under physiological conditions. In some embodiments, a biodegradable substance is a substance that is broken down by cellular machinery. In some embodiments, a biodegradable substance is a substance that is broken down by chemical processes.

Biologically active: As used herein, the phrase “biologically active” refers to a characteristic of any substance that has activity in a biological system and/or organism. For instance, a substance that, when administered to an organism, has a biological effect on that organism, is considered to be biologically active. In particular embodiments, where a nucleic acid is biologically active, a portion of that nucleic acid that shares at least one biological activity of the whole nucleic acid is typically referred to as a “biologically active” portion.

Carbohydrate: The term “carbohydrate” refers to a sugar or polymer of sugars. The terms “saccharide,” “polysaccharide,” “carbohydrate,” and “oligosaccharide” may be used interchangeably. Most carbohydrates are aldehydes or ketones with many hydroxyl groups, usually one on each carbon atom of the molecule. Carbohydrates generally have the molecular formula C_(n)H_(2n)O_(n). A carbohydrate may be a monosaccharide, a disaccharide, trisaccharide, oligosaccharide, or polysaccharide. The most basic carbohydrate is a monosaccharide, such as glucose, galactose, mannose, ribose, arabinose, xylose, and fructose. Disaccharides are two joined monosaccharides. Exemplary disaccharides include sucrose, maltose, cellobiose, and lactose. Typically, an oligosaccharide includes between three and six monosaccharide units (e.g., raffinose, stachyose), and polysaccharides include six or more monosaccharide units. Exemplary polysaccharides include starch, glycogen, and cellulose. Carbohydrates may contain modified saccharide units, such as 2′-deoxyribose, wherein a hydroxyl group is removed; 2′-fluororibose, wherein a hydroxyl group is replaced with a fluorine; or N-acetylglucosamine, a nitrogen-containing form of glucose. Carbohydrates may exist in many different forms, for example, conformers, cyclic forms, acyclic forms, stereoisomers, tautomers, anomers, and isomers.

Characteristic portion: As used herein, a “characteristic portion” of a substance, in the broadest sense, is one that shares some degree of sequence and/or structural identity and/or at least one functional characteristic with the relevant intact substance. For example, a “characteristic portion” of a protein or polypeptide is one that contains a continuous stretch of amino acids, or a collection of continuous stretches of amino acids, that together are characteristic of a protein or polypeptide. In some embodiments, each such continuous stretch generally will contain at least 2, at least 5, at least 10, at least 15, at least 20, at least 50, or more amino acids. A “characteristic portion” of a nucleic acid is one that contains a continuous stretch of nucleotides, or a collection of continuous stretches of nucleotides, that together are characteristic of a nucleic acid. In some embodiments, each such continuous stretch generally will contain at least 2, at least 5, at least 10, at least 15, at least 20, at least 50, or more nucleotides. In some embodiments, a characteristic portion is biologically active.

Conserved: As used herein, the term “conserved” refers to nucleotides or amino acid residues of a polynucleotide sequence or amino acid sequence, respectively, that occur unaltered in the same position of two or more related sequences being compared. Nucleotides or amino acids that are relatively conserved are those that are conserved amongst more related sequences than nucleotides or amino acids appearing elsewhere in the sequences. In some embodiments, two or more sequences are said to be “completely conserved” if they are 100% identical to one another. In some embodiments, two or more sequences are said to be “highly conserved” if they are at least 70% identical, at least 80% identical, at least 90% identical, or at least 95% identical to one another. In some embodiments, two or more sequences are said to be “highly conserved” if they are about 70% identical, about 80% identical, about 90% identical, about 95% identical, about 98% identical, or about 99% identical to one another. In some embodiments, two or more sequences are said to be “conserved” if they are at least 30% identical, at least 40% identical, at least 50% identical, at least 60% identical, at least 70% identical, at least 80% identical, at least 90% identical, or at least 95% identical to one another. In some embodiments, two or more sequences are said to be “conserved” if they are about 30% identical, about 40% identical, about 50% identical, about 60% identical, about 70% identical, about 80% identical, about 90% identical, about 95% identical, about 98% identical, or about 99% identical to one another.

Expression: As used herein, “expression” of a nucleic acid sequence refers to one or more of the following events: (1) production of an RNA template from a DNA sequence (e.g., by transcription); (2) processing of an RNA transcript (e.g., by splicing, editing, 5′ cap formation, and/or 3′ end processing); (3) translation of an RNA into a polypeptide or protein; and (4) post-translational modification of a polypeptide or protein.

Functional: As used herein, a “functional” biological molecule is a biological molecule in a form in which it exhibits a property and/or activity by which it is characterized.

Fusion protein: As used herein, a “fusion protein” includes a first protein moiety, e.g., a supercharged protein, having a peptide linkage with a second protein moiety. In certain embodiments, the fusion protein is encoded by a single fusion gene.

Gene: As used herein, the term “gene” has its meaning as understood in the art. It will be appreciated by those of ordinary skill in the art that the term “gene” may include gene regulatory sequences (e.g., promoters, enhancers, etc.) and/or intron sequences. It will further be appreciated that definitions of gene include references to nucleic acids that do not encode proteins but rather encode functional RNA molecules such as RNAi agents, ribozymes, tRNAs, etc. For the purpose of clarity it should be noted that, as used in the present application, the term “gene” generally refers to a portion of a nucleic acid that encodes a protein; the term may optionally encompass regulatory sequences, as will be clear from context to those of ordinary skill in the art. This definition is not intended to exclude application of the term “gene” to non-protein-coding expression units but rather to clarify that, in most cases, the term as used in this document refers to a protein-coding nucleic acid.

Gene product or expression product: As used herein, the term “gene product” or “expression product” generally refers to an RNA transcribed from the gene (pre- and/or post-processing) or a polypeptide (pre- and/or post-modification) encoded by an RNA transcribed from the gene.

Green fluorescent protein: As used herein, the term “green fluorescent protein” (GFP) refers to a protein originally isolated from the jellyfish Aequorea victoria that fluoresces green when exposed to blue light, or a derivative of such a protein (e.g., a supercharged version of the protein). The amino acid sequence of wild type GFP is as follows:

(SEQ ID NO: 17) MSKGEELFTG VVPILVELDG DVNGHKFSVS GEGEGDATYG KLTLKFICTT GKLPVPWPTL VTTFSYGVQC FSRYPDHMKQ HDFFKSAMPE GYVQERTIFF KDDGNYKTRA EVKFEGDTLV NRIELKGIDF KEDGNILGHK LEYNYNSHNV YIMADKQKNG IKVNFKIRHN IEDGSVQLAD HYQQNTPIGD GPVLLPDNHY LSTQSALSKD PNEKRDHMVL LEFVTAAGIT HGMDELYK.

Proteins that are at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, or at least 99% homologous are also considered to be green fluorescent proteins. In certain embodiments, the green fluorescent protein is supercharged. In certain embodiments, the green fluorescent protein is superpositively charged (e.g., +15 GFP, +25 GFP, and +36 GFP as described herein). In certain embodiments, the GFP may be modified to include a polyhistidine tag for ease in purification of the protein. In certain embodiments, the GFP may be fused with another protein or peptide (e.g., hemagglutinin 2 (HA2) peptide). In certain embodiments, the GFP may be further modified biologically or chemically (e.g., post-translational modifications, proteolysis, etc.).
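For readers who wish to relate a sequence such as SEQ ID NO: 17 to a label such as “+36 GFP,” a theoretical net charge can be estimated directly from the amino acid sequence. The sketch below rests on assumptions that this section does not itself state: that such labels denote a theoretical net charge, and that the charge is approximated by counting lysine and arginine as +1 and aspartate and glutamate as -1, ignoring histidine, the termini, and pH-dependent effects. The function name is hypothetical.

```python
POSITIVE_RESIDUES = set("KR")  # lysine, arginine (assumed +1 each)
NEGATIVE_RESIDUES = set("DE")  # aspartate, glutamate (assumed -1 each)


def theoretical_net_charge(sequence: str) -> int:
    """Estimate net charge as (K + R) - (D + E) from a one-letter sequence.

    Whitespace in the input (e.g., blocks of ten residues) is ignored.
    """
    seq = "".join(sequence.split()).upper()
    positives = sum(1 for residue in seq if residue in POSITIVE_RESIDUES)
    negatives = sum(1 for residue in seq if residue in NEGATIVE_RESIDUES)
    return positives - negatives


if __name__ == "__main__":
    # First two blocks of SEQ ID NO: 17, used only to exercise the function.
    print(theoretical_net_charge("MSKGEELFTG VVPILVELDG"))
```

Passing the full sequence of SEQ ID NO: 17 gives an estimate for wild type GFP; a more careful estimate would account for side-chain pKa values at the pH of interest.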

Homology: As used herein, the term “homology” refers to the overall relatedness between polymeric molecules, e.g., between nucleic acid molecules (e.g., DNA molecules and/or RNA molecules) and/or between polypeptide molecules. In some embodiments, polymeric molecules are considered to be “homologous” to one another if their sequences are at least 25%, at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or at least 99% identical. In some embodiments, polymeric molecules are considered to be “homologous” to one another if their sequences are at least 25%, at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or at least 99% similar. The term “homologous” necessarily refers to a comparison between at least two sequences (nucleotide sequences or amino acid sequences). In accordance with the invention, two nucleotide sequences are considered to be homologous if the polypeptides they encode are at least about 50% identical, at least about 60% identical, at least about 70% identical, at least about 80% identical, or at least about 90% identical for at least one stretch of at least about 20 amino acids. In some embodiments, homologous nucleotide sequences are characterized by the ability to encode a stretch of at least 4-5 uniquely specified amino acids. Both the identity and the approximate spacing of these amino acids relative to one another must be considered for nucleotide sequences to be considered homologous. For nucleotide sequences less than 60 nucleotides in length, homology is determined by the ability to encode a stretch of at least 4-5 uniquely specified amino acids. In accordance with the invention, two protein sequences are considered to be homologous if the proteins are at least about 50% identical, at least about 60% identical, at least about 70% identical, at least about 80% identical, or at least about 90% identical for at least one stretch of at least about 20 amino acids.

Hydrophilic: As used herein, a “hydrophilic” substance is a substance that may be soluble in polar dispersion media. In some embodiments, a hydrophilic substance can transiently bond with polar dispersion media. In some embodiments, a hydrophilic substance transiently bonds with polar dispersion media through hydrogen bonding. In some embodiments, the polar dispersion medium is water. In some embodiments, a hydrophilic substance may be ionic. In some embodiments, a hydrophilic substance may be non-ionic. In some embodiments, a substance is hydrophilic relative to another substance because it is more soluble in water, polar dispersion media, or hydrophilic dispersion media than is the other substance. In some embodiments, a substance is hydrophilic relative to another substance because it is less soluble in oil, non-polar dispersion media, or hydrophobic dispersion media than is the other substance.

Hydrophobic: As used herein, a “hydrophobic” substance is a substance that may be soluble in non-polar dispersion media. In some embodiments, a hydrophobic substance is repelled from polar dispersion media. In some embodiments, the polar dispersion medium is water. In some embodiments, hydrophobic substances are non-polar. In some embodiments, a substance is hydrophobic relative to another substance because it is more soluble in oil, non-polar dispersion media, or hydrophobic dispersion media than is the other substance. In some embodiments, a substance is hydrophobic relative to another substance because it is less soluble in water, polar dispersion media, or hydrophilic dispersion media than is the other substance.

Identity: As used herein, the term “identity” refers to the overall relatedness between polymeric molecules, e.g., between nucleic acid molecules (e.g., DNA molecules and/or RNA molecules) and/or between polypeptide molecules. Calculation of the percent identity of two nucleic acid sequences, for example, can be performed by aligning the two sequences for optimal comparison purposes (e.g., gaps can be introduced in one or both of a first and a second nucleic acid sequence for optimal alignment and non-identical sequences can be disregarded for comparison purposes). In certain embodiments, the length of a sequence aligned for comparison purposes is at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, or 100% of the length of the reference sequence. The nucleotides at corresponding nucleotide positions are then compared. When a position in the first sequence is occupied by the same nucleotide as the corresponding position in the second sequence, then the molecules are identical at that position. The percent identity between the two sequences is a function of the number of identical positions shared by the sequences, taking into account the number of gaps, and the length of each gap, which need to be introduced for optimal alignment of the two sequences. The comparison of sequences and determination of percent identity between two sequences can be accomplished using a mathematical algorithm. For example, the percent identity between two nucleotide sequences can be determined using methods such as those described in Computational Molecular Biology, Lesk, A. M., ed., Oxford University Press, New York, 1988; Biocomputing: Informatics and Genome Projects, Smith, D. W., ed., Academic Press, New York, 1993; Sequence Analysis in Molecular Biology, von Heijne, G., Academic Press, 1987; Computer Analysis of Sequence Data, Part I, Griffin, A. M., and Griffin, H. G., eds., Humana Press, New Jersey, 1994; and Sequence Analysis Primer, Gribskov, M. and Devereux, J., eds., M Stockton Press, New York, 1991; each of which is incorporated herein by reference. For example, the percent identity between two nucleotide sequences can be determined using the algorithm of Meyers and Miller (CABIOS, 1989, 4:11-17), which has been incorporated into the ALIGN program (version 2.0) using a PAM120 weight residue table, a gap length penalty of 12 and a gap penalty of 4. The percent identity between two nucleotide sequences can, alternatively, be determined using the GAP program in the GCG software package using an NWSgapdna.CMP matrix. Methods commonly employed to determine percent identity between sequences include, but are not limited to, those disclosed in Carillo, H., and Lipman, D., SIAM J Applied Math., 48:1073 (1988); incorporated herein by reference. Techniques for determining identity are codified in publicly available computer programs. Exemplary computer software to determine homology between two sequences includes, but is not limited to, the GCG program package (Devereux, J., et al., Nucleic Acids Research, 12(1), 387 (1984)), BLASTP, BLASTN, and FASTA (Altschul, S. F. et al., J. Molec. Biol., 215, 403 (1990)).
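The position-by-position comparison described above can be sketched for two sequences that have already been aligned. This is a minimal illustration and not the ALIGN, GAP, or BLAST algorithm: it assumes the alignment (including gap placement) was produced elsewhere, uses "-" as the gap character, and takes the number of alignment columns as the denominator, which is only one of several conventions in use. The function name is hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity of two pre-aligned sequences of equal length.

    A column counts as identical when both sequences carry the same
    non-gap character. Columns that are gaps in both sequences are skipped.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == gap and b == gap)
    ]
    if not columns:
        return 0.0
    identical = sum(1 for a, b in columns if a == b)
    return 100.0 * identical / len(columns)


if __name__ == "__main__":
    # Five identical columns out of six alignment columns.
    print(round(percent_identity("ACGT-A", "ACGTTA"), 2))
```

Because gapped columns remain in the denominator, each introduced gap lowers the reported identity, in keeping with the statement that the number and length of gaps are taken into account.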

Inhibit expression of a gene: As used herein, the phrase “inhibit expression of a gene” means to cause a reduction in the amount of an expression product of the gene. The expression product can be an RNA transcribed from the gene (e.g., an mRNA) or a polypeptide translated from an mRNA transcribed from the gene. Typically a reduction in the level of an mRNA results in a reduction in the level of a polypeptide translated therefrom. The level of expression may be determined using standard techniques for measuring mRNA or protein.

In vitro: As used herein, the term “in vitro” refers to events that occur in an artificial environment, e.g., in a test tube or reaction vessel, in cell culture, in a Petri dish, etc., rather than within an organism (e.g., animal, plant, or microbe).

In vivo: As used herein, the term “in vivo” refers to events that occur within an organism (e.g., animal, plant, or microbe).

Isolated: As used herein, the term “isolated” refers to a substance or entity that has been (1) separated from at least some of the components with which it was associated when initially produced (whether in nature or in an experimental setting), and/or (2) produced, prepared, and/or manufactured by the hand of man. Isolated substances and/or entities may be separated from at least about 10%, about 20%, about 30%, about 40%, about 50%, about 60%, about 70%, about 80%, about 90%, or more of the other components with which they were initially associated. In some embodiments, isolated agents are more than about 80%, about 85%, about 90%, about 91%, about 92%, about 93%, about 94%, about 95%, about 96%, about 97%, about 98%, about 99%, or more than about 99% pure. As used herein, a substance is “pure” if it is substantially free of other components.

microRNA (miRNA): As used herein, the term “microRNA” or “miRNA” refers to an RNAi agent that is approximately 21 nucleotides (nt) to 23 nt in length. miRNAs can range between 18 nt and 26 nt in length. Typically, miRNAs are single-stranded. However, in some embodiments, miRNAs may be at least partially double-stranded. In certain embodiments, miRNAs may comprise an RNA duplex (referred to herein as a “duplex region”) and may optionally further comprise one to three single-stranded overhangs. In some embodiments, an RNAi agent comprises a duplex region ranging from 15 bp to 29 bp in length and optionally further comprising one or two single-stranded overhangs. An miRNA may be formed from two RNA molecules that hybridize together, or may alternatively be generated from a single RNA molecule that includes a self-hybridizing portion. In general, free 5′ ends of miRNA molecules have phosphate groups, and free 3′ ends have hydroxyl groups. The duplex portion of an miRNA usually, but not necessarily, comprises one or more bulges consisting of one or more unpaired nucleotides. One strand of an miRNA includes a portion that hybridizes with a target RNA. In certain embodiments, one strand of the miRNA is not precisely complementary with a region of the target RNA, meaning that the miRNA hybridizes to the target RNA with one or more mismatches. In some embodiments, one strand of the miRNA is precisely complementary with a region of the target RNA, meaning that the miRNA hybridizes to the target RNA with no mismatches. Typically, miRNAs are thought to mediate inhibition of gene expression by inhibiting translation of target transcripts. However, in some embodiments, miRNAs may mediate inhibition of gene expression by causing degradation of target transcripts.
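The distinction drawn above between precise complementarity and hybridization with one or more mismatches can be illustrated by counting mismatched positions between an miRNA strand and a target site of equal length. The sketch below is a simplification under stated assumptions: both sequences are written 5′ to 3′, pairing is antiparallel, only Watson-Crick pairs (A:U, G:C) are treated as matches so that G:U wobble pairs count as mismatches, and bulges are not modeled. The function name and example sequences are hypothetical.

```python
# Watson-Crick partners for RNA bases (G:U wobble pairs are not treated as matches).
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def count_mismatches(mirna: str, target_site: str) -> int:
    """Count non-Watson-Crick positions between an miRNA strand and a target
    site of the same length, both given 5'->3' and paired antiparallel."""
    mirna = mirna.upper()
    target_site = target_site.upper()
    if len(mirna) != len(target_site):
        raise ValueError("miRNA and target site must have the same length")
    mismatches = 0
    for i, base in enumerate(mirna):
        # Position i of the miRNA pairs with position (n - 1 - i) of the target.
        partner = target_site[len(target_site) - 1 - i]
        if RNA_COMPLEMENT.get(base) != partner:
            mismatches += 1
    return mismatches


if __name__ == "__main__":
    print(count_mismatches("UGAGGUAG", "CUACCUCA"))  # precisely complementary
    print(count_mismatches("UGAGGUAG", "CUACGUCA"))  # one mismatched position
```

A count of zero corresponds to the precisely complementary case described above; any positive count corresponds to hybridization with mismatches.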

Nucleic acid: As used herein, the term “nucleic acid,” in its broadest sense, refers to any compound and/or substance that is or can be incorporated into an oligonucleotide chain. In some embodiments, a nucleic acid is a compound and/or substance that is or can be incorporated into an oligonucleotide chain via a phosphodiester linkage. In some embodiments, “nucleic acid” refers to individual nucleic acid residues (e.g., nucleotides and/or nucleosides). In some embodiments, “nucleic acid” refers to an oligonucleotide chain comprising individual nucleic acid residues. As used herein, the terms “oligonucleotide” and “polynucleotide” can be used interchangeably to refer to a polymer of nucleotides (e.g., a string of at least two nucleotides). In some embodiments, “nucleic acid” encompasses RNA as well as single and/or double-stranded DNA and/or cDNA. Furthermore, the terms “nucleic acid,” “DNA,” “RNA,” and/or similar terms include nucleic acid analogs, i.e., analogs having other than a phosphodiester backbone. For example, the so-called “peptide nucleic acids,” which are known in the art and have peptide bonds instead of phosphodiester bonds in the backbone, are considered within the scope of the present invention. The term “nucleotide sequence encoding an amino acid sequence” includes all nucleotide sequences that are degenerate versions of each other and/or encode the same amino acid sequence. Nucleotide sequences that encode proteins and/or RNA may include introns. Nucleic acids can be purified from natural sources, produced using recombinant expression systems and optionally purified, chemically synthesized, etc. Where appropriate, e.g., in the case of chemically synthesized molecules, nucleic acids can comprise nucleoside analogs such as analogs having chemically modified bases or sugars, backbone modifications, etc. A nucleic acid sequence is presented in the 5′ to 3′ direction unless otherwise indicated. The term “nucleic acid segment” is used herein to refer to a nucleic acid sequence that is a portion of a longer nucleic acid sequence. In many embodiments, a nucleic acid segment comprises at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, or more residues. In some embodiments, a nucleic acid is or comprises natural nucleosides (e.g., adenosine, thymidine, guanosine, cytidine, uridine, deoxyadenosine, deoxythymidine, deoxyguanosine, and deoxycytidine); nucleoside analogs (e.g., 2-aminoadenosine, 2-thiothymidine, inosine, pyrrolo-pyrimidine, 3-methyl adenosine, 5-methylcytidine, C5-bromouridine, C5-fluorouridine, C5-iodouridine, C5-propynyl-uridine, C5-propynyl-cytidine, C5-methylcytidine, 7-deazaadenosine, 7-deazaguanosine, 8-oxoadenosine, 8-oxoguanosine, O(6)-methylguanine, and 2-thiocytidine); chemically modified bases; biologically modified bases (e.g., methylated bases); intercalated bases; modified sugars (e.g., 2′-fluororibose, ribose, 2′-deoxyribose, arabinose, and hexose); and/or modified phosphate groups (e.g., phosphorothioates and 5′-N-phosphoramidite linkages). In some embodiments, the present invention is specifically directed to “unmodified nucleic acids,” meaning nucleic acids (e.g., polynucleotides and residues, including nucleotides and/or nucleosides) that have not been chemically modified in order to facilitate or achieve delivery.

Polymer: As used herein, the term “polymer” refers to any substance comprising at least two repeating structural units (i.e., “monomers”) which are associated with one another. In some embodiments, monomers are covalently associated with one another. In some embodiments, monomers are non-covalently associated with one another. Polymers may be homopolymers or copolymers comprising two or more monomers. In terms of sequence, copolymers may be random, block, graft, or comprise a combination of random, block, and/or graft sequences. In some embodiments, block copolymers are diblock copolymers. In some embodiments, block copolymers are triblock copolymers. In some embodiments, polymers can be linear or branched polymers. In some embodiments, polymers in accordance with the invention comprise blends, mixtures, and/or adducts of any of the polymers described herein. Typically, polymers in accordance with the present invention are organic polymers. In some embodiments, polymers are hydrophilic. In some embodiments, polymers are hydrophobic. In some embodiments, polymers are modified with one or more moieties and/or functional groups.

Protein: As used herein, the term “protein” refers to a polypeptide (i.e., a string of at least two amino acids linked to one another by peptide bonds). Proteins may include moieties other than amino acids (e.g., may be glycoproteins) and/or may be otherwise processed or modified. Those of ordinary skill in the art will appreciate that a “protein” can be a complete polypeptide chain as produced by a cell (with or without a signal sequence), or can be a functional portion thereof. Those of ordinary skill will further appreciate that a protein can sometimes include more than one polypeptide chain, for example linked by one or more disulfide bonds or associated by other means. Polypeptides may contain L-amino acids, D-amino acids, or both and may contain any of a variety of amino acid modifications or analogs known in the art. Useful modifications include, e.g., addition of a chemical entity such as a carbohydrate group, a phosphate group, a farnesyl group, an isofarnesyl group, a fatty acid group, an amide group, a terminal acetyl group, a linker for conjugation, functionalization, or other modification (e.g., alpha amidation), etc. In a preferred embodiment, the modifications of the peptide lead to a more stable peptide (e.g., greater half-life in vivo). These modifications may include cyclization of the peptide, the incorporation of D-amino acids, etc. None of the modifications should substantially interfere with the desired biological activity of the peptide. In certain embodiments, the modifications of the peptide lead to a more biologically active peptide. In some embodiments, polypeptides may comprise natural amino acids, non-natural amino acids, synthetic amino acids, amino acid analogs, and combinations thereof. The term “peptide” is typically used to refer to a polypeptide having a length of less than about 100 amino acids.

Reprogramming factor: As used herein, the term “reprogramming factor” refers to a factor that, alone or in combination with other factors, can change the state of a cell from a somatic, differentiated state into a pluripotent stem cell state. Non-limiting examples of reprogramming factors include a protein, e.g., a transcription factor, a peptide, a nucleic acid, or a small molecule.

RNA interference (RNAi): As used herein, the term “RNA interference” or “RNAi” refers to sequence-specific inhibition of gene expression and/or reduction in target RNA levels mediated by an RNA, which RNA comprises a portion that is substantially complementary to a target RNA. Typically, at least part of the substantially complementary portion is within the double-stranded region of the RNA. In some embodiments, RNAi can occur via selective intracellular degradation of RNA. In some embodiments, RNAi can occur by translational repression.

RNAi agent: As used herein, the term “RNAi agent” or “RNAi” refers to an RNA, optionally including one or more nucleotide analogs or modifications, having a structure characteristic of molecules that can mediate inhibition of gene expression through an RNAi mechanism. In some embodiments, RNAi agents mediate inhibition of gene expression by causing degradation of target transcripts. In some embodiments, RNAi agents mediate inhibition of gene expression by inhibiting translation of target transcripts. Generally, an RNAi agent includes a portion that is substantially complementary to a target RNA. In some embodiments, RNAi agents are at least partly double-stranded. In some embodiments, RNAi agents are single-stranded. In some embodiments, exemplary RNAi agents can include siRNA, shRNA, and/or miRNA. In some embodiments, RNAi agents may be composed entirely of natural RNA nucleotides (i.e., adenine, guanine, cytosine, and uracil). In some embodiments, RNAi agents may include one or more non-natural RNA nucleotides (e.g., nucleotide analogs, DNA nucleotides, etc.). Inclusion of non-natural RNA nucleic acid residues may be used to make the RNAi agent more resistant to cellular degradation than RNA. In some embodiments, the term “RNAi agent” may refer to any RNA, RNA derivative, and/or nucleic acid encoding an RNA that induces an RNAi effect (e.g., degradation of target RNA and/or inhibition of translation). In some embodiments, an RNAi agent may comprise a blunt-ended (i.e., without overhangs) dsRNA that can act as a Dicer substrate. For example, such an RNAi agent may comprise a blunt-ended dsRNA which is ≥25 base pairs in length, which may optionally be chemically modified to abrogate an immune response.

RNAi-inducing agent: As used herein, the term “RNAi-inducing agent” encompasses any entity that delivers, regulates, and/or modifies the activity of an RNAi agent. In some embodiments, RNAi-inducing agents may include vectors (other than naturally occurring molecules not modified by the hand of man) whose presence within a cell results in RNAi and leads to reduced expression of a transcript to which the RNAi-inducing agent is targeted. In some embodiments, RNAi-inducing agents are RNAi-inducing vectors. In some embodiments, RNAi-inducing agents are compositions comprising RNAi agents and one or more pharmaceutically acceptable excipients and/or carriers. In some embodiments, an RNAi-inducing agent is an “RNAi-inducing vector,” which refers to a vector whose presence within a cell results in production of one or more RNAs that self-hybridize or hybridize to each other to form an RNAi agent (e.g. siRNA, shRNA, and/or miRNA). In various embodiments, this term encompasses plasmids, e.g., DNA vectors (whose sequence may comprise sequence elements derived from a virus), or viruses (other than naturally occurring viruses or plasmids that have not been modified by the hand of man), whose presence within a cell results in production of one or more RNAs that self-hybridize or hybridize to each other to form an RNAi agent. In general, the vector comprises a nucleic acid operably linked to expression signal(s) so that one or more RNAs that hybridize or self-hybridize to form an RNAi agent are transcribed when the vector is present within a cell. Thus the vector provides a template for intracellular synthesis of the RNA or RNAs or precursors thereof. For purposes of inducing RNAi, presence of a viral genome in a cell (e.g., following fusion of the viral envelope with the cell membrane) is considered sufficient to constitute presence of the virus within the cell. In addition, for purposes of inducing RNAi, a vector is considered to be present within a cell if it is introduced into the cell, enters the cell, or is inherited from a parental cell, regardless of whether it is subsequently modified or processed within the cell. An RNAi-inducing vector is considered to be targeted to a transcript if presence of the vector within a cell results in production of one or more RNAs that hybridize to each other or self-hybridize to form an RNAi agent that is targeted to the transcript, i.e., if presence of the vector within a cell results in production of one or more RNAi agents targeted to the transcript.

Short, interfering RNA (siRNA): As used herein, the term “short, interfering RNA” or “siRNA” refers to an RNAi agent comprising an RNA duplex (referred to herein as a “duplex region”) that is approximately 19 base pairs (bp) in length and optionally further comprises one to three single-stranded overhangs. In some embodiments, an RNAi agent comprises a duplex region ranging from 15 bp to 29 bp in length and optionally further comprising one or two single-stranded overhangs. An siRNA may be formed from two RNA molecules that hybridize together, or may alternatively be generated from a single RNA molecule that includes a self-hybridizing portion. In general, free 5′ ends of siRNA molecules have phosphate groups, and free 3′ ends have hydroxyl groups. The duplex portion of an siRNA may, but typically does not, comprise one or more bulges consisting of one or more unpaired nucleotides. One strand of an siRNA includes a portion that hybridizes with a target transcript. In certain embodiments, one strand of the siRNA is precisely complementary with a region of the target transcript, meaning that the siRNA hybridizes to the target transcript without a single mismatch. In some embodiments, one or more mismatches between the siRNA and the targeted portion of the target transcript may exist. In some embodiments in which perfect complementarity is not achieved, any mismatches are generally located at or near the siRNA termini. In some embodiments, siRNAs mediate inhibition of gene expression by causing degradation of target transcripts.
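The notion of a strand being “precisely complementary” to a region of a target transcript can be illustrated with a short Python sketch. This is only an illustration under simplifying assumptions: both sequences are written 5′ to 3′, have equal length, and contain only the four natural RNA bases, and only Watson-Crick pairs count as matches (G-U wobble pairs are treated as mismatches). The function names are hypothetical and do not come from the specification.

```python
# Watson-Crick pairing for the four natural RNA bases.
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Return the reverse complement of an RNA sequence written 5' to 3'."""
    return "".join(RNA_COMPLEMENT[base] for base in reversed(rna.upper()))


def mismatch_positions(strand, target_region):
    """List 0-based positions (along the strand, 5' to 3') that fail to pair.

    Assumption: strand and target_region have equal length and are both
    written 5' to 3'; only Watson-Crick pairs are counted as matches.
    """
    if len(strand) != len(target_region):
        raise ValueError("strand and target region must have equal length")
    perfect_strand = reverse_complement(target_region)
    return [
        i
        for i, (actual, ideal) in enumerate(zip(strand.upper(), perfect_strand))
        if actual != ideal
    ]


def is_precisely_complementary(strand, target_region):
    """True when the strand pairs with the target region without a single mismatch."""
    return not mismatch_positions(strand, target_region)
```

Because positions are reported along the strand, a caller could check whether any mismatches fall at or near the termini, for example by comparing each position against the strand length.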

Short hairpin RNA (shRNA): As used herein, the term “short hairpin RNA” or “shRNA” refers to an RNAi agent comprising an RNA having at least two complementary portions hybridized or capable of hybridizing to form a double-stranded (duplex) structure sufficiently long to mediate RNAi (typically at least approximately 19 bp in length), and at least one single-stranded portion, typically ranging between approximately 1 nucleotide (nt) and approximately 10 nt in length that forms a loop. In some embodiments, an shRNA comprises a duplex portion ranging from 15 bp to 29 bp in length and at least one single-stranded portion, typically ranging between approximately 1 nt and approximately 10 nt in length that forms a loop. The duplex portion may, but typically does not, comprise one or more bulges consisting of one or more unpaired nucleotides. In some embodiments, shRNAs mediate inhibition of gene expression by causing degradation of target transcripts. shRNAs are thought to be processed into siRNAs by the conserved cellular RNAi machinery. Thus shRNAs may be precursors of siRNAs. Regardless, shRNAs in general are capable of inhibiting expression of a target RNA, similar to siRNAs.

Small molecule: In general, a “small molecule” refers to a substantially non-peptidic, non-oligomeric organic compound either prepared in the laboratory or found in nature. Small molecules, as used herein, can refer to compounds that are “natural product-like”; however, the term “small molecule” is not limited to “natural product-like” compounds. Rather, a small molecule is typically characterized in that it contains several carbon-carbon bonds, and has a molecular weight of less than 1500 g/mol, less than 1250 g/mol, less than 1000 g/mol, less than 750 g/mol, less than 500 g/mol, or less than 250 g/mol, although this characterization is not intended to be limiting for the purposes of the present invention. In certain other embodiments, natural-product-like small molecules are utilized.

Similarity: As used herein, the term “similarity” refers to the overall relatedness between polymeric molecules, e.g. between nucleic acid molecules (e.g. DNA molecules and/or RNA molecules) and/or between polypeptide molecules. Calculation of percent similarity of polymeric molecules to one another can be performed in the same manner as a calculation of percent identity, except that calculation of percent similarity takes into account conservative substitutions as is understood in the art.
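The relationship between percent identity and percent similarity can be sketched in Python as follows. This is a minimal illustration, not the calculation used by any particular alignment tool: it assumes two polypeptide sequences that are already aligned, ungapped, and of equal length, and the conservative-substitution groups shown are a hypothetical simplification (practical tools typically rely on substitution matrices).

```python
# Hypothetical, simplified groups of amino acids treated as conservative
# substitutions for one another; chosen for illustration only.
CONSERVATIVE_GROUPS = [
    set("ILVM"),  # aliphatic / hydrophobic
    set("FYW"),   # aromatic
    set("KRH"),   # basic
    set("DE"),    # acidic
    set("ST"),    # small hydroxyl
    set("NQ"),    # amide
]


def _check_aligned(seq_a, seq_b):
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty and of equal length")


def is_conservative(res_a, res_b):
    """True when two different residues fall within the same group."""
    return any(res_a in group and res_b in group for group in CONSERVATIVE_GROUPS)


def percent_identity(seq_a, seq_b):
    """Percentage of aligned positions holding identical residues."""
    _check_aligned(seq_a, seq_b)
    identical = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * identical / len(seq_a)


def percent_similarity(seq_a, seq_b):
    """Like percent identity, but conservative substitutions also count."""
    _check_aligned(seq_a, seq_b)
    similar = sum(a == b or is_conservative(a, b) for a, b in zip(seq_a, seq_b))
    return 100.0 * similar / len(seq_a)
```

For the aligned pair `MKVLD` and `MRVLE`, three of five positions are identical, while the remaining two (K/R and D/E) are conservative under the groups assumed here, so percent similarity exceeds percent identity.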

Stable: As used herein, the term “stable” as applied to a protein refers to any aspect of protein stability. The stable modified protein as compared to the original unmodified protein possesses any one or more of the following characteristics: more soluble, more resistant to aggregation, more resistant to denaturation, more resistant to unfolding, more resistant to improper or undesired folding, greater ability to renature, increased thermal stability, increased stability in a variety of environments (e.g., pH, salt concentration, presence of detergents, presence of denaturing agents, etc.), and increased stability in non-aqueous environments. In certain embodiments, the stable modified protein exhibits at least two of the above characteristics. In certain embodiments, the stable modified protein exhibits at least three of the above characteristics. Such characteristics may allow the active protein to be produced at higher levels. For example, the modified protein can be overexpressed at a higher level without aggregation than the unmodified version of the protein. Such characteristics may also allow the protein to be used as a therapeutic agent or a research tool.

Subject: As used herein, the term “subject” or “patient” refers to any organism to which a composition in accordance with the invention may be administered, e.g., for experimental, diagnostic, prophylactic, and/or therapeutic purposes. Typical subjects include animals (e.g., mammals such as mice, rats, rabbits, non-human primates, and humans) and/or plants.

Substantially: As used herein, the term “substantially” refers to the qualitative condition of exhibiting total or near-total extent or degree of a characteristic or property of interest. One of ordinary skill in the biological arts will understand that biological and chemical phenomena rarely, if ever, go to completion and/or proceed to completeness or achieve or avoid an absolute result. The term “substantially” is therefore used herein to capture the potential lack of completeness inherent in many biological and chemical phenomena.

Suffering from: An individual who is “suffering from” a disease, disorder, and/or condition has been diagnosed with or displays one or more symptoms of a disease, disorder, and/or condition.

Supercharge: As used herein, the term “supercharge” refers to any modification of a protein that results in the increase or decrease of the overall net charge of the protein. Modifications include, but are not limited to, alterations in amino acid sequence or addition of charged moieties (e.g., carboxylic acid groups, phosphate groups, sulfate groups, amino groups). Supercharging also refers to the association of an agent with a charged protein, naturally occurring or modified, to form a complex with increased or decreased charge relative to the agent alone.
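As a rough illustration of how an alteration in amino acid sequence changes overall net charge, the following Python sketch estimates a theoretical net charge by counting charged residues. It is a simplified assumption, not a method recited in the specification: lysine and arginine are each counted as +1, aspartate and glutamate as −1, and histidine, the termini, pH, and any added charged moieties are ignored. The function names and example sequences are hypothetical.

```python
# Residues counted as charged under the simplifying assumption described above.
POSITIVE_RESIDUES = set("KR")  # lysine, arginine: +1 each
NEGATIVE_RESIDUES = set("DE")  # aspartate, glutamate: -1 each


def theoretical_net_charge(sequence):
    """Estimate net charge as (number of K + R) minus (number of D + E)."""
    seq = sequence.upper()
    positive = sum(residue in POSITIVE_RESIDUES for residue in seq)
    negative = sum(residue in NEGATIVE_RESIDUES for residue in seq)
    return positive - negative


def charge_change(original, modified):
    """Change in estimated net charge produced by a sequence alteration."""
    return theoretical_net_charge(modified) - theoretical_net_charge(original)
```

Under these assumptions, replacing two aspartate residues in the toy sequence `MDEDE` with lysine (giving `MKEKE`) raises the estimated net charge from −4 to 0, a change of +4.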

Supercharged complex: As defined herein, a “supercharged complex” refers to the combination of one or more agents associated with a supercharged protein, engineered or naturally occurring, that collectively has an increased or decreased charge relative to the agent alone.

Susceptible to: An individual who is “susceptible to” a disease, disorder, and/or condition has not been diagnosed with and/or may not exhibit symptoms of the disease, disorder, and/or condition. In some embodiments, an individual who is susceptible to a disease, disorder, and/or condition (for example, cancer) may be characterized by one or more of the following: (1) a genetic mutation associated with development of the disease, disorder, and/or condition; (2) a genetic polymorphism associated with development of the disease, disorder, and/or condition; (3) increased and/or decreased expression and/or activity of a protein and/or nucleic acid associated with the disease, disorder, and/or condition; (4) habits and/or lifestyles associated with development of the disease, disorder, and/or condition; (5) a family history of the disease, disorder, and/or condition; and (6) exposure to and/or infection with a microbe associated with development of the disease, disorder, and/or condition. In some embodiments, an individual who is susceptible to a disease, disorder, and/or condition will develop the disease, disorder, and/or condition. In some embodiments, an individual who is susceptible to a disease, disorder, and/or condition will not develop the disease, disorder, and/or condition.

Targeting agent or targeting moiety: As used herein, the term “targeting agent” or “targeting moiety” refers to any substance that binds to a component associated with a cell, tissue, and/or organ. Such a component is referred to as a “target” or a “marker.” A targeting agent or targeting moiety may be a polypeptide, glycoprotein, nucleic acid, small molecule, carbohydrate, lipid, etc. In some embodiments, a targeting agent or targeting moiety is an antibody or characteristic portion thereof. In some embodiments, a targeting agent or targeting moiety is a receptor or characteristic portion thereof. In some embodiments, a targeting agent or targeting moiety is a ligand or characteristic portion thereof. In some embodiments, a targeting agent or targeting moiety is a nucleic acid targeting agent (e.g. an aptamer) that binds to a cell type specific marker. In some embodiments, a targeting agent or targeting moiety is an organic small molecule. In some embodiments, a targeting agent or targeting moiety is an inorganic small molecule.

Target gene: As used herein, the term “target gene” refers to any gene whose expression is altered by an RNAi or other agent.

Target transcript: As used herein, the term “target transcript” refers to any mRNA transcribed from a target gene.

Therapeutically effective amount: As used herein, the term “therapeutically effective amount” means an amount of an agent to be delivered (e.g., nucleic acid, drug, therapeutic agent, diagnostic agent, prophylactic agent, etc.) that is sufficient, when administered to a subject suffering from or susceptible to a disease, disorder, and/or condition, to treat, improve symptoms of, diagnose, prevent, and/or delay the onset of the disease, disorder, and/or condition.

Transcription factor: As used herein, the term “transcription factor” refers to a DNA-binding protein that regulates transcription of DNA into RNA, for example, by activation or repression of transcription. Some transcription factors effect regulation of transcription alone, while others act in concert with other proteins. Some transcription factors can both activate and repress transcription under certain conditions. In general, transcription factors bind a specific target sequence or sequences highly similar to a specific consensus sequence in a regulatory region of a target gene. Transcription factors may regulate transcription of a target gene alone or in a complex with other molecules.

Treating: As used herein, the term “treating” refers to partially or completely alleviating, ameliorating, improving, relieving, delaying onset of, inhibiting progression of, reducing severity of, and/or reducing incidence of one or more symptoms or features of a particular disease, disorder, and/or condition. For example, “treating” cancer may refer to inhibiting survival, growth, and/or spread of a tumor. Treatment may be administered to a subject who does not exhibit signs of a disease, disorder, and/or condition and/or to a subject who exhibits only early signs of a disease, disorder, and/or condition for the purpose of decreasing the risk of developing pathology associated with the disease, disorder, and/or condition. In some embodiments, treatment comprises delivery of a supercharged protein associated with a therapeutically active nucleic acid to a subject in need thereof.

Unmodified: As used herein, “unmodified” refers to the protein or agent prior to being supercharged or associated in a complex with a supercharged protein, engineered or naturally occurring.

Vector: As used herein, “vector” refers to a nucleic acid molecule which can transport another nucleic acid to which it has been linked. In some embodiments, vectors can achieve extra-chromosomal replication and/or expression of nucleic acids to which they are linked in a host cell such as a eukaryotic and/or prokaryotic cell. Vectors capable of directing the expression of operatively linked genes are referred to herein as “expression vectors.”

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1. Supercharged green fluorescent proteins (GFPs). (A) Protein sequences of GFP variants, with fluorophore-forming residues highlighted green, negatively charged residues highlighted red, and positively charged residues highlighted blue. (B-D) Electrostatic surface potentials of sfGFP (B), GFP(+36) (C), and GFP(−30) (D), colored from −25 kT/e (red) to +25 kT/e (blue).

FIG. 2. Intramolecular properties of GFP variants. (A) Staining and UV fluorescence of purified GFP variants. Each lane and tube contains 0.2 μg of protein. (B) Circular dichroism spectra of GFP variants. (C) Thermodynamic stability of GFP variants, measured by guanidinium-induced unfolding.

FIG. 3. Intermolecular properties of supercharged proteins. (A) UV-illuminated samples of purified GFP variants (“native”), those samples heated 1 minute at 100° C. (“boiled”), and those samples subsequently cooled for 2 hours at 25° C. (“cooled”). (B) Aggregation of GFP variants was induced with 40% TFE at 25° C. and monitored by right-angle light scattering. (C) Supercharged GFPs adhere reversibly to oppositely charged macromolecules. Sample 1: 6 μg of GFP(+36) in 30 μl of 25 mM Tris pH 7.0 and 100 mM NaCl. Sample 2: 6 μg of GFP(−30) added to sample 1. Sample 3: 30 μg of salmon sperm DNA added to sample 1. Sample 4: 20 μg of E. coli tRNA added to sample 1. Sample 5: Addition of 1 M NaCl to sample 4. Samples 6-8: identical to samples 1, 2, and 4, respectively, except using sfGFP instead of GFP(+36). All samples were spun briefly in a microcentrifuge and visualized under UV light.

FIG. 4. (A) Excitation and (B) emission spectra of GFP variants. Each sample contained an equal amount of protein as quantitated by chromophore absorbance at 490 nm.

FIG. 5. Supercharged Surfaces Dominate Intermolecular Interactions. Supercharged GFPs adhere non-specifically and reversibly with oppositely charged macromolecules (“protein Velcro”). Such interactions can result in the formation of precipitates. Unlike aggregates of denatured proteins, these precipitates contain folded, fluorescent GFP and dissolve in 1 M salt. Shown here are: +36 GFP alone; +36 GFP mixed with −30 GFP; +36 GFP mixed with tRNA; +36 GFP mixed with tRNA in 1 M NaCl; sfGFP (−7); and sfGFP mixed with −30 GFP.

FIG. 6. Superpositive GFP Binds siRNA. GFP-siRNA complex does not co-migrate with siRNA in an agarose gel. +36 GFP was incubated with siRNA, and the resulting complexes were subjected to agarose gel electrophoresis. Various +36 GFP:siRNA ratios were tested in this assay: 0:1, 1:1, 1:2, 1:3, 1:4, 1:5, and 1:10. +36 GFP was shown to form a stable complex with siRNA in a ˜1:3 stoichiometry. Non-superpositive proteins were shown not to bind siRNA. A 50:1 ratio of sfGFP:siRNA was tested, but, even at such high levels of excess, sfGFP did not associate with siRNA.

FIG. 7. Superpositive GFP Penetrates Cells. HeLa cells were incubated with GFP (either sfGFP (−7), −30 GFP, or +36 GFP), washed, fixed, and stained. +36 GFP, but not sfGFP or −30 GFP, potently penetrated HeLa cells. Left: DAPI staining of DNA to mark cells. Middle: GFP staining to mark where cellular uptake of GFP occurred. Right: movie showing +36 GFP localization as it occurs.

FIG. 8. Superpositive GFP Delivers siRNA into Human Cells. +36 GFP was shown to potently deliver siRNA into HeLa cells. Left: Lipofectamine 2000 and Cy3-siRNA; right: +36 GFP and Cy3-siRNA. Hoechst channel, blue, was used to visualize DNA, thereby marking the position of cells; Cy3 channel, red, was used to visualize Cy3-tagged siRNA; GFP channel, green, was used to visualize GFP; yellow indicates sites of co-localization between siRNA and GFP.

FIG. 9. Delivery of siRNA into Cell Lines Resistant to Traditional Transfection: murine 3T3-L₁ pre-adipocyte cells (“3T3L cells”). 3T3L cells were treated with either: Lipofectamine 2000 and Cy3-siRNA (left); or +36 GFP and Cy3-siRNA (right). 3T3L cells were poorly transfected by Lipofectamine but were efficiently transfected by +36 GFP. Hoechst channel, blue, was used to visualize DNA, thereby marking the position of cells; Cy3 channel, red, was used to visualize Cy3-tagged siRNA; GFP channel, green, was used to visualize GFP. Yellow indicates sites of co-localization between siRNA and GFP.

FIG. 10. Delivery of siRNA into Cell Lines Resistant to Traditional Transfection: rat IMCD cells. Rat IMCD cells were treated with either Lipofectamine 2000 and Cy3-siRNA (left); or +36 GFP and Cy3-siRNA (right). Rat IMCD cells were poorly transfected by Lipofectamine but were efficiently transfected by +36 GFP. Hoechst channel, blue, was used to visualize DNA, thereby marking the position of cells; Cy3 channel, red, was used to visualize Cy3-tagged siRNA; GFP channel, green, was used to visualize GFP. Yellow indicates sites of co-localization between siRNA and GFP.

FIG. 11. Delivery of siRNA into Cell Lines Resistant to Traditional Transfection: human ST14A neurons. Human ST14A neurons were treated with either Lipofectamine 2000 and Cy3-siRNA (left); or +36 GFP and Cy3-siRNA (right). Human ST14A neurons were poorly transfected by Lipofectamine but were efficiently transfected by +36 GFP. DAPI channel, blue, was used to visualize DNA, thereby marking the position of cells; Cy3 channel, red, was used to visualize Cy3-tagged siRNA; GFP channel, green, was used to visualize GFP. Yellow indicates sites of co-localization between siRNA and GFP.

FIG. 12. Flow Cytometry Analysis of siRNA Transfection. LEFT: Lipofectamine. Each column corresponds to experiments performed with different transfection methods: Lipofectamine (blue); and 20 nM +36 GFP (red). Each chart corresponds to experiments performed with different cell types: IMCD cells, PC12 cells, HeLa cells, 3T3L cells, and Jurkat cells. The X-axis represents measurements obtained from the Cy3 channel, which is a readout of siRNA fluorescence. The Y-axis represents cell count in flow cytometry experiments. Flow cytometry data indicate that cells were more efficiently transfected with siRNA using +36 GFP than Lipofectamine.

FIG. 13. siRNA Delivered with +36 GFP Can Induce Gene Knockdown. 50 nM GAPDH siRNA was transfected into five different cell types (HeLa, IMCD, 3T3L, PC12, and Jurkat cell lines) using either ˜2 μM Lipofectamine 2000 (black bars) or 20 nM +36 GFP (green bars). The Y-axis represents GAPDH protein levels as a fraction of tubulin protein levels.

FIG. 14. Mechanistic Probes of Cell Penetration. HeLa cells were treated with one of a variety of probes for 30 minutes and were then treated with 5 nM +36 GFP. Samples included: (A) no probe; (B) 4° C. preincubation (inhibits energy-dependent processes); (C) 100 mM sucrose (inhibits clathrin-mediated endocytosis), left, and 25 μg/ml nystatin (disrupts caveolar function), right; (D) 25 μM cytochalasin B (inhibits macropinocytosis), left, and 5 μM monensin (inhibits endosome receptor recycling), right.

FIG. 15. Factors Contributing to Cell-Penetrating Activity. Charge magnitude was shown to contribute to cell-penetrating activity. In particular, +15 GFP or Lys₂₀₋₅₀ was shown not to penetrate cells. Left: 20 mM +15 GFP and 50 nM siRNA-Cy3. Middle: 20 nM +36 GFP. Right: 60 nM Lys₂₀₋₅₀ and 50 nM siRNA-Cy3. Hoechst channel, blue, was used to visualize DNA, thereby marking the position of cells; GFP channel, green, was used to visualize GFP.

FIG. 16. Supercharged GFP variants and their ability to penetrate cells. (A) Calculated electrostatic surface potential of GFP variants, colored from −25 kT/e (dark red) to +25 kT/e (dark blue). (B) Flow cytometry analysis showing amounts of internalized GFP in HeLa cells independently treated with 200 nM of each GFP variant and washed three times with PBS containing heparin to remove cell surface-bound GFP. (C) Flow cytometry analysis showing amounts of internalized +36 GFP (green) in HeLa, IMCD, 3T3-L, PC12, and Jurkat cells compared to background fluorescence in untreated cells (black).

FIG. 17. (A) Internalization of +36 GFP in HeLa cells after co-incubation for 1 hour at 37° C. (B) Inhibition of +36 GFP cell penetration in HeLa cells incubated at 4° C. for 1 hour. Cells were only partially washed to enable +36 GFP to remain partially bound to the cell surface. (C) and (D) +36 GFP internalization under the conditions in (A) but in the presence of caveolin-dependent endocytosis inhibitors filipin and nystatin, respectively. (E) +36 GFP internalization under the conditions in (A) but in the presence of the clathrin-dependent endocytosis inhibitor chlorpromazine. (F) Cellular localization of Alexa Fluor 647-labeled transferrin (red) and +36 GFP (green) 20 minutes after endocytosis. (G) Inhibition of +36 GFP internalization in HeLa cells in the presence of the actin polymerization inhibitor cytochalasin D. (H) Inhibition of +36 GFP internalization in HeLa cells treated with 80 mM sodium chlorate. (I) Internalization of +36 GFP in CHO cells incubated at 37° C. for 1 hour. (J) Lack of +36 GFP internalization in PDG-CHO cells. In (I) and (J) cell nuclei were stained with DAPI (blue).

FIG. 18. (A) Gel-shift assay showing unbound siRNA (33) stained by ethidium bromide to determine superpositive GFP:siRNA binding stoichiometry. 10 pmoles of siRNA was mixed with various molar ratios of each GFP for 10 minutes at 25° C., then analyzed by non-denaturing PAGE. The rightmost lane in each row shows a 100:1 mixture of sfGFP and siRNA. (B) Flow cytometry analysis showing levels of internalized siRNA in HeLa cells treated with a mixture of 50 nM Cy3-siRNA and 200 nM of +15, +25, or +36 GFP, followed by three heparin washes to remove non-internalized protein (see FIG. 22). Data from HeLa cells treated with siRNA but no transfection reagent are shown in black. (C) Flow cytometry analysis showing levels of Cy3-labeled siRNA delivered into HeLa, IMCD, 3T3-L, PC12, and Jurkat cells after incubation with a mixture of 50 nM Cy3-siRNA and either 200 nM +36 GFP (green) or ˜2 μM Lipofectamine 2000 (blue) in comparison to cells treated with siRNA without transfection reagent (black). Cells were washed before flow cytometry as described above. (D) Fluorescence microscopy images of stably adherent cell lines (HeLa, IMCD, and 3T3-L) 24 hours after a 4-hour treatment with 200 nM +36 GFP and 50 nM Cy3-siRNA. Each image is an overlay of three channels: blue (DAPI stain), red (Cy3-siRNA), and green (+36 GFP); yellow indicates the colocalization of red and green. Magnification for all three images was 40×.

FIG. 19. Suppression of GAPDH mRNA and protein levels resulting from siRNA delivery. (A) GAPDH mRNA level suppression in HeLa cells 48, 72, or 96 hours after treatment with 50 nM siRNA and ˜2 μM Lipofectamine 2000, or with 50 nM siRNA and 200 nM +36 GFP, as measured by RT-QPCR. Suppression levels shown are normalized to β-actin mRNA levels; 0% suppression is defined as the mRNA level in cells treated with ˜2 μM Lipofectamine 2000 and 50 nM scrambled negative control siRNA. (B) GAPDH protein level suppression in HeLa cells 48, 72, and 96 hours after treatment with siRNA and ˜2 μM Lipofectamine 2000, or with siRNA and 200 nM +36 GFP. (C) GAPDH protein level suppression in HeLa, IMCD, 3T3-L, PC12, and Jurkat cells 96 hours after treatment with 50 nM siRNA and ˜2 μM Lipofectamine 2000, 200 nM +36 GFP, or 200 nM +36 GFP-HA2. For (B) and (C), suppression levels shown are measured by Western blot and are normalized to β-tubulin protein levels; 0% suppression is defined as the protein level in cells treated with ˜2 μM Lipofectamine 2000 and a scrambled negative control siRNA. Values and error bars represent the mean and the standard deviation of three independent experiments in (A) and (B) and five independent experiments in (C).

FIG. 20. The siRNA transfection activities of a variety of cationic synthetic peptides compared with that of +15 and +36 GFP. Flow cytometry was used to measure the levels of internalized Cy3-siRNA in HeLa cells treated for 4 hours with a mixture of 50 nM Cy3-siRNA and either 200 nM or 2 μM of the peptide or protein shown.

FIG. 21. Plasmid DNA transfection into HeLa, IMCD, 3T3-L, PC12, and Jurkat cells by Lipofectamine 2000, +36 GFP, or +36 GFP-HA2. Cells were treated with 800 ng pSV-β-galactosidase plasmid and 200 nM or 2 μM of +36 GFP or +36 GFP-HA2 for 4 hours. After 24 hours, β-galactosidase activity was measured using the β-Fluor kit (Novagen). Values and error bars represent the mean and standard deviation of three independent experiments.

FIG. 22. The effectiveness of the washing protocol used to remove cell surface-bound supercharged GFP. HeLa cells were treated with 200 nM +36 GFP at 4° C. (to block cell uptake of GFP, see the main text) for 1 hour. Cells were then washed three times (1 minute for each wash) with 4° C. PBS or with 4° C. 20 U/mL heparin sulfate in PBS, then analyzed by flow cytometry. Cells washed with PBS show significant GFP fluorescence presumably arising from cell-surface bound GFP. In contrast, cells washed with 20 U/mL heparin in PBS exhibit GFP fluorescence levels equivalent to untreated cells.

FIG. 23. Concentration dependence of +36 GFP cell penetration in HeLa cells. HeLa cells were treated with +36 GFP in serum-free media for 4 hours. Cells were trypsinized and replated in 10% FBS in DMEM on glass slides coated with Matrigel (BD Biosciences). After 24 hours at 37° C., cells were fixed with 4% formaldehyde in PBS, stained with DAPI, and imaged using a Leica DMRB inverted microscope. Magnification for all images is 20×.

FIG. 24. Fluorescence microscopy reveals no internalized Cy3-siRNA in IMCD and 3T3-L cells using Fugene 6 (Roche) transfection agent. Cells were treated with Fugene 6 in serum-free media for 4 hours following the manufacturer's protocol. Cells were trypsinized and pelleted. The trypsin-containing media was removed by aspiration and the cells were resuspended in 10% FBS in DMEM then plated on glass slides precoated with Matrigel™. Cells were allowed to adhere for 24 hours, fixed with 4% formaldehyde in PBS, stained with DAPI, and imaged using a Leica DMRB inverted microscope. Magnification for all images is 20×. No Cy3 fluorescence was observed (compare with FIG. 18D).

FIG. 25. (A) MTT cytotoxicity assay for five mammalian cell lines treated with 50 nM siRNA and ˜2 μM Lipofectamine 2000, +36 GFP, or +36 GFP-HA2. Data were taken 24 hours after treatment. Values and error bars reflect the mean and the standard deviation of three independent experiments. Cells treated with +36 GFP or +36 GFP-HA2 but without the MTT reagent did not exhibit significant absorbance under these conditions. (B) MTT cytotoxicity assay of HeLa cells treated with 50 nM siRNA and either 200 nM or 2 μM cationic polymer. Treatment with chloroquine or pyrene butyric acid proved cytotoxic (lanes 9 and 10, respectively).

FIG. 26. Gel-shift assay showing unbound linearized pSV-β-galactosidase plasmid DNA (Promega) to determine +36 GFP:plasmid DNA binding stoichiometry. In each lane 22 fmol of pSV-β-galactosidase linearized by EcoRI digestion was combined with various molar ratios of +36 GFP and incubated at 25° C. for 10 minutes. Samples were analyzed by electrophoresis at 140 V for 50 minutes on a 1% agarose gel containing ethidium bromide.

FIG. 27. SDS-PAGE analysis of purified GFP variants used in this work. The proteins were visualized by staining with Coomassie Blue. The migration points of molecular weight markers are listed on the left. Note that supercharged GFP migrates during SDS-PAGE in a manner that is partially dependent on theoretical net charge magnitude, rather than solely on actual molecular weight.

FIG. 28. Fluorescence spectra of all GFP analogs used in this study (10 nM each protein, excitation at 488 nm).

FIG. 29. (A) Representative Western blot data 4 days after treatment with ˜2 μM Lipofectamine 2000 and 50 nM negative control siRNA. (B) Representative Western blot data 4 days after treatment with 200 nM +36 GFP and 50 nM negative control siRNA. (C) Representative Western blot data showing GAPDH and β-tubulin levels 48, 72, and 96 hours after treatment with 50 nM GAPDH siRNA and either ˜2 μM Lipofectamine 2000 or 200 nM +36 GFP. (D) Representative Western blot data 4 days after treatment with ˜2 μM Lipofectamine 2000 and 50 nM GAPDH siRNA. (E) Representative Western blot data 4 days after treatment with 200 nM +36 GFP and 50 nM GAPDH siRNA. (F) Representative Western blot data 4 days after treatment with 200 nM +36 GFP-HA2 and 50 nM GAPDH siRNA. (G) Representative Western blot data from HeLa cells four days after treatment with ˜2 μM Lipofectamine 2000 and 50 nM negative control siRNA, Lipofectamine 2000 and 50 nM β-actin targeting siRNA, 200 nM +36 GFP and 50 nM β-actin targeting siRNA, or 200 nM +36 GFP and 50 nM negative control siRNA.

FIG. 30. Fluorescence microscopy reveals no internalized Cy3-siRNA or GFP in HeLa cells treated at 4° C., or in HeLa cells pretreated with cytochalasin D (10 μg/mL). Image is of cells 1 hour after treatment with a solution containing 200 nM +36 GFP and 50 nM siRNA. Images were taken on an inverted spinning disk confocal microscope equipped with a filter to detect GFP emission. To facilitate visualization, cells were washed twice (one minute each) with 20 U/mL heparin in PBS to remove most (but not all) surface-bound GFP-siRNA.

FIG. 31. (A) Dynamic Light Scattering (DLS) data showing the hydrodynamic radius (Hr) of particles formed from mixing 20 μM +36 GFP and 5 μM of a double-stranded RNA 20-mer. (B) Fluorescence microscopy image of the above sample. The image shown is an overlay of brightfield and GFP channel images; note that the larger features are actually smaller particles associated together as the sample dried. Scale bar=10 μm.

FIG. 32. (A) Digestion of +36 GFP and bovine serum albumin by proteinase K. 100 pmol of +36 GFP or bovine serum albumin (BSA) was treated with 0.6 units of proteinase K at 37° C. Samples were mixed with SDS protein loading buffer, heated to 90° C. for 10 minutes, and analyzed by SDS-PAGE on a 4-12% acrylamide gel stained with Coomassie Blue. (B) Stability of +36 GFP and BSA in murine serum. 100 pmol of each protein in PBS was mixed with 5 μL of murine serum to a total volume of 10 μL and incubated at 37° C. Samples were mixed with SDS protein loading buffer and heated to 90° C. for 10 minutes. The resulting mixture was analyzed by SDS-PAGE on a 4-12% acrylamide gel and the +36 GFP and BSA protein bands were revealed by Western blot. The bottom image is 5 μL of sample of +36 GFP-siRNA complexes (discussed in C) analyzed for GFP by Western blot. (C) Stability of siRNA complexed with +36 GFP in murine serum. siRNA (10 pmol) was mixed with sfGFP (40 pmol) or +36 GFP (40 pmol), and incubated in 4 μL of PBS for 10 minutes at 25° C. The resulting solution was added to four volumes of mouse serum (20 μL total) and incubated at 37° C. for the indicated times, precipitated with ethanol, and analyzed by gel electrophoresis on a 15% acrylamide gel. (D) Stability of plasmid DNA complexed with +36 GFP or sfGFP in murine serum. Plasmid DNA (0.026 pmol) was mixed with 12.8 pmol of either +36 GFP or sfGFP in 4 μL of PBS for 10 minutes. To this solution was added 16 μL of mouse serum (20 μL total). Samples were incubated at 37° C. for the indicated times. DNA was isolated by extraction with phenol-chloroform and precipitation with ethanol, then analyzed by gel electrophoresis on a 1% agarose gel.

FIG. 33. A: Internalization of mCherry using (1) mCherry-TAT; (2) mCherry-Arg₉; and (3) mCherry-ALAL-+36 GFP in HeLa, PC12, and IMCD cell lines. Flow cytometry of HeLa, PC12, and IMCD (inner medullary collecting duct) cells incubated in the presence of the specified concentrations of Tat-mCherry, Arg9-mCherry, +36 GFP-mCherry, or wild-type mCherry alone for 4 hours at 37° C. Cells were washed three times with 20 U/mL heparin in PBS to remove membrane-bound protein before analysis. B: Membrane-bound protein is removed by heparin washing conditions. (a) Live-cell fluorescence microscopy indicates that at 4° C., +36 GFP-mCherry is membrane-bound but not internalized. After washing with heparin (but not after washing with PBS), this +36 GFP-mCherry signal is largely removed. At 37° C., most of the +36 GFP-mCherry signal remains even after heparin washing, consistent with internalization of +36 GFP-mCherry. (b) HeLa and PC12 cells subjected to the conditions described in (a) were trypsinized (which destroys surface-bound mCherry) then analyzed by flow cytometry. Cells incubated with +36 GFP-mCherry at 4° C. do not show significant mCherry fluorescence compared to cells incubated at 37° C., further suggesting that the signal at 37° C. represents internalized protein signal, and that internalization at 4° C. is inefficient.

FIG. 34. Fluorescence microscopy images of HeLa, PC12, and IMCD cells four hours after treatment with 50 nM mCherry-ALAL-+36 GFP. Each image is an overlay of three channels: blue (DAPI stain for DNA), red (mCherry), and green (+36 GFP). Yellow indicates colocalization of red and green.

FIG. 35. Human proteins deliver siRNA to HeLa cells. (A) Human proteins were mixed at increasing mass ratios with siRNA and assayed for unbound siRNA by PAGE and ethidium bromide staining. Decreasing band intensities demonstrate siRNA binding by human proteins. (B) Human proteins were mixed with Cy3-labelled siRNA and applied to HeLa cells for four hours. Cells were then washed and assayed for Cy3 fluorescence by flow cytometry. A shift of the peak to the right demonstrates siRNA internalization. (C) HeLa cells were transfected with siRNA using human proteins, incubated for three days, and assayed for degradation of a targeted mRNA. Targeted GAPDH mRNA levels were compared relative to β-actin mRNA levels. “Control” indicates use of a non-targeting siRNA. Lipofectamine 2000 was used as positive control.

FIG. 36. Mammalian cell penetration of +36 GFP protein fusions. (a) +36 GFP fusion architectures. (b) Flow cytometry of HeLa cells incubated at the concentrations shown in the presence of +36 GFP fusions for 4 hours at 37° C. Cells were washed three times with 20 U/mL heparin in PBS to remove membrane-bound protein prior to analysis. Untreated cells resulted in median GFP fluorescence values of 107±5. Error bars represent the range of values of two independent biological replicates. (c) Flow cytometry of HeLa cells incubated in the presence of 100 nM of each +36 GFP fusion at 37° C. for the specified time. Untreated cells resulted in median GFP fluorescence values of 100±6. Error bars represent the range of values of two independent biological replicates. (d) Fluorescence microscopy of HeLa cells incubated in the presence of +36 GFP fusions at 100 nM for 30 min at 37° C. Cell nuclei were stained with DAPI (blue). The scale bar represents 15 μm.

FIG. 37. A: Deubiquitination suggests cytosolic exposure of a ubiquitin-+36 GFP fusion protein. Shown are Western blots using anti-His (green) and anti-GFP (red) antibodies. All proteins carry an N-terminal 6×His tag on the ubiquitin moiety. (a) Lanes 1 and 2: purified protein samples of wild-type ubiquitin-+36 GFP (wt) or G76V mutant ubiquitin-+36 GFP (G76V). Lanes 3 and 4: purified protein spiked into HeLa cell lysate to check the possible effect of lysis conditions on fusion protein integrity. Lanes 5 and 6: HeLa cell lysates treated with 200 nM of either the wt or G76V ubiquitin-+36 GFP for 1 hour. (b) Effect of chloroquine on ubiquitin-+36 GFP deubiquitination. Cells were treated with 200 nM of wt or G76V ubiquitin-+36 GFP, either in the presence or absence of 200 μM chloroquine for 1 hour. (c) In vitro deubiquitination assay. Ubiquitin-+36 GFP fusion proteins were incubated either in HeLa cytosolic extract or in HeLa cytosolic extract containing 10 mM of the DUB inhibitor N-ethylmaleimide (NEM) for 30 minutes at 37° C. B: (a) Western blots using anti-GFP antibodies. Lanes 1-3: purified protein samples of +36 GFP, wild-type ubiquitin-+36 GFP fusion (wt) or G76V mutant ubiquitin-+36 GFP fusion (mut). Lanes 4 and 5: purified protein spiked into HeLa cell lysate to confirm that lysis conditions do not affect fusion protein integrity. Lanes 6-11: the indicated cells were treated with 100 nM of either the wt or mutant ubiquitin-+36 GFP for 1 hour, then lysed. (b) Mean extent of deubiquitination of wt ubiquitin-+36 GFP fusion protein in HeLa, 3T3, and BSR cells. Error bars reflect the standard deviation of three independent biological replicates. (c) In vitro deubiquitination control experiment. Ubiquitin-+36 GFP fusion proteins were incubated in either HeLa cytosolic extract or in HeLa cytosolic extract containing one of two DUB inhibitors, 10 mM N-ethylmaleimide (NEM) or 20 μg/mL ubiquitin aldehyde (Ub-Al) for 1 hour at 37° C.

FIG. 38. Comparison of delivery of Cre recombinase to mammalian cells using Tat, Arg₉, or +36 GFP fusions. (a) Cathepsin B-mediated cleavage of +36 GFP-Cre. Lane 1: +36 GFP; lane 2: Cre; lane 3: +36 GFP-Cre fusion; lane 4: +36 GFP-Cre after incubation with cathepsin B. (b) In vitro Cre activity assay. Lane 1: pCALNL alone; lane 2: pCALNL after incubation with 100 nM Cre for 30 minutes at 37° C.; lane 3: identical to lane 2, but with +36 GFP-Cre; lane 4: identical to lane 2 but with +36 GFP-Cre pre-treated with cathepsin B; lane 5: identical to lane 2 but with cathepsin B. (c) Cre-mediated recombination frequency in HeLa cells transiently transfected with pCALNL and treated with +36 GFP-Cre, Tat-Cre, or Arg₉-Cre. The picture is an overlay of DsRed2 signal and brightfield images of HeLa cells transfected with pCALNL-DsRed2 and treated with 100 nM +36 GFP-Cre. (d) Cre-mediated recombination frequency in 3T3.loxP.lacZ cells treated with +36 GFP-Cre, Tat-Cre, or Arg₉-Cre. The picture is of 3T3.loxP.lacZ cells treated with 500 nM +36 GFP-Cre and stained with X-Gal. (e) Cre-mediated recombination frequency in mES.LSL.mCherry cells treated with 1 μM +36 GFP-Cre, Tat-Cre, or Arg₉-Cre. The pictures are of cells 24 hours after treatment (left) and of replated recombinant mES cells allowed to regrow into colonies (right). Error bars reflect the standard error of either five (c and d) or three (e) independent biological replicates.

FIG. 39. SDS-PAGE analysis of +36 GFP fusion proteins after purification. Lane 1: +36 GFP; lane 2: +36 GFP-mCherry; lane 3: +36 GFP-Cre; lane 4: Ubiquitin-+36 GFP. The proteins on the gel were stained with Coomassie Blue.

FIG. 40. Fluorescence spectra of +36 GFP fusion proteins. (a) Absorbance spectra of +36 GFP fusions. (b) Excitation spectra of +36 GFP fusions; emission wavelength=515 nm. (c) Emission spectra of +36 GFP fusions; excitation wavelength=488 nm. (d) Cleavage of +36 GFP-mCherry fusion. Lane 1: +36 GFP; lane 2: mCherry; lane 3: +36 GFP-mCherry fusion; lane 4: +36 GFP-mCherry fusion after treatment with 500 ng cathepsin B for 45 min at 37° C. (e) Excitation spectrum of +36 GFP-mCherry after incubation with or without cathepsin B; emission wavelength=515 nm. The inset absorbance spectrum shows that the absorbance level of the GFP fluorophore in both samples is equivalent. (f) Emission spectrum of +36 GFP-mCherry incubated with or without cathepsin B; excitation wavelength=488 nm. Spectra were obtained on a Tecan Safire II with 5 nm bandwidth filters in a 96-well glass-bottom white wall Costar plate.

FIG. 41. Protein fusions with +36 GFP exhibit no significant cell toxicity. A: HeLa cells were plated at 50,000 cells per well in a 24-well plate. After 12 hours, cells were incubated with 2 μM of the indicated protein for four hours in serum-free DMEM. Cells were washed three times with heparin and incubated in complete DMEM for 24 hours, then subjected to an MTT assay (Sigma). Values represent the average of two independent experiments, each performed in duplicate, normalized to untreated cells. Error bars indicate the range in values. B: +36 GFP and +36 GFP fusions are not toxic at concentrations effective for protein delivery. At concentrations ≧˜10 to 100 times the effective concentration for protein delivery in this work, +36 GFP-Cre (but not other +36 GFP fusion proteins or +36 GFP itself) reduced the viability of some cell lines and possibly stimulated IMCD cells. Values and error bars represent the average and standard deviation, respectively, of three independent biological replicates.

FIG. 42. Localization of internalized mCherry. HeLa cells incubated with 100 nM +36 GFP-mCherry for four hours were trypsinized and plated together with untreated HeLa cells. As imaged by live-cell widefield fluorescence, a diffuse red signal (mCherry) was observed in the cytoplasm of the GFP-containing cells, while adjacent GFP-negative cells did not have a diffuse red signal. +36 GFP (green signal) appears to remain as puncta.

FIG. 43. Tat-Cre and Cre-Arg fusions retain Cre recombinase activity in vitro. pCALNL-DsRed2 (Addgene: 13769) contains a 1.2 kB region flanked by parallel loxP recognition sites. 500 ng of pCALNL-DsRed2 linearized by PvuI was incubated with 1 picomole of the listed protein in 50 μL of 50 mM Tris pH 7.5, 33 mM NaCl, 10 mM MgCl₂ at 37° C. for 30 min, then at 70° C. for 10 min. DNA was isolated from the reaction by QIAquick spin column (Qiagen) and analyzed by electrophoresis on a 1% agarose gel. The gel was stained with ethidium bromide for 30 minutes and recombination products were visualized by ultraviolet light.

FIG. 44. A: +36 GFP potently delivers fused proteins into mammalian cells. Cell-penetration potency of +36 GFP exceeds that of known cell-penetrating peptides and proteins, especially at low concentrations. B: Comparison of mCherry delivery by +36 GFP, Tat, Arg10, and penetratin. (a) Flow cytometry of HeLa, BSR, 3T3, PC12 and IMCD cells incubated in the presence of the specified concentrations of +36 GFP-mCherry, Tat-mCherry, Arg10-mCherry, penetratin-mCherry or wild-type mCherry alone for 4 hours at 37° C. Cells were washed three times with 20 U/mL heparin in PBS to remove membrane-bound protein before analysis. Error bars represent the standard error of three independent biological replicates. (b) Confocal fluorescence microscopy of live cells incubated with 100 nM +36 GFP-mCherry for 4 hours at 37° C. Red color represents mCherry signal; green color represents +36 GFP signal. The scale bar is 15 μm.

FIG. 45. Functional, nuclearly localized protein delivered by +36 GFP. In HeLa and NIH-3T3 cells, Cre delivery is more effective with +36 GFP than with known cell-penetrating peptides and proteins.

FIG. 46. Delivery of proteins non-covalently associated with supercharged protein. +52 streptavidin delivers biotinylated small molecules and biotinylated protein into cells.

FIG. 47. Net charge magnitude, distribution, and structure as determinants of cell-penetration potency.

FIG. 48. Supercharged proteins encoded in the human genome.

FIG. 49. Naturally occurring human supercharged proteins can deliver siRNA into cells.

FIG. 50. Naturally occurring human proteins can deliver functional Cre recombinase.

FIG. 51. In vivo siRNA delivery by +36 GFP.

FIG. 52. In vivo delivery of functional Cre recombinase by +36 GFP. Upper panel: fluorescence microscopy of a retinal section of a CD1 adult mouse injected with 0.5 μL of 100 μM +36 GFP. The retina was harvested and analyzed six hours after injection. GFP fluorescence is shown in green and DAPI nuclear stain is shown in blue. Red fluorescence in the right image indicates recombination of an RFP-loxP reporter construct by Cre. Lower panel: X-gal staining of p0 mouse pup retinas harboring a loxP-LacZ reporter treated ex vivo.

FIG. 53. In vivo delivery of functional Cre recombinase by +36 GFP. Retinal sections of neonatal RC::PFwe mouse pups harboring a nuclear LacZ reporter of Cre activity. Three days after injection of 0.5 μL of 40 μM wild-type Cre, Tat-Cre, or +36 GFP-Cre, retinae were harvested, fixed, and stained with X-gal. Dots on the lower right graph represent the total number of recombined cells counted in each retina. The horizontal bar represents the average number of recombined cells per retina for each protein injected (n=4 for wild-type Cre, n=6 for Tat-Cre, n=6 for +36 GFP-Cre).

FIG. 54. Peptide fusion strategies for improvement of endosomal escape.

DETAILED DESCRIPTION OF CERTAIN EMBODIMENTS OF THE INVENTION

The present invention provides compositions, preparations, systems, and related methods for enhancing delivery of a protein or other agent to cells by supercharging the protein itself or by associating the protein or other agent (e.g., peptides, proteins, small molecules) with a supercharged protein. Such systems and methods generally comprise the use of supercharged proteins. In some embodiments, the supercharged protein itself is delivered to the interior of a cell, e.g., to cause a biological effect on the cell into which it penetrates for therapeutic benefit. Supercharged proteins can also be used to deliver other agents. For example, superpositively charged proteins may be associated with agents having a negative charge, e.g., nucleic acids (which typically have a net negative charge) or negatively charged peptides or proteins via electrostatic interactions to form complexes. Supernegatively charged proteins may be associated with agents having a positive charge. Agents to be delivered may also be associated with the supercharged protein through covalent linkages or other non-covalent interactions. In some embodiments, such compositions, preparations, systems, and methods involve altering the primary sequence of a protein in order to “supercharge” the protein (e.g., to generate a superpositively-charged protein). In certain embodiments, the inventive system uses a naturally occurring protein to form a complex. In certain embodiments, the inventive complex comprises a supercharged protein and one or more agents to be delivered (e.g., nucleic acid, protein, peptide, small molecule). In one example of cellular uptake, supercharged proteins have been found to be endocytosed by cells.
The supercharged protein, or the supercharged protein mixed with an agent to be delivered to form a protein/agent complex, is effectively transfected into the cell. Mechanistic studies indicate the endocytosis of these complexes involves sulfated cell surface proteoglycans but does not involve clathrin or caveolin. In some embodiments, supercharged protein or complexes comprising supercharged proteins and one or more agents to be delivered are useful as therapeutic agents, diagnostic agents, or research tools. In some embodiments, an agent and/or supercharged protein may be therapeutically active. In some embodiments, a supercharged protein or complex is used to modulate the expression of a gene in a cell. In some embodiments, a supercharged protein or complex is used to modulate a biological pathway (e.g., a signaling pathway, a metabolic pathway) in a cell. In some embodiments, a supercharged protein or complex is used to inhibit the activity of an enzyme in a cell. In some embodiments, inventive supercharged proteins or complexes and/or pharmaceutical compositions thereof are administered to a subject in need thereof. In some embodiments, inventive supercharged proteins or complexes and/or compositions thereof are contacted with a cell under conditions effective to transfect the agent into a cell (e.g., human cells, mammalian cells, T-cells, neurons, stem cells, progenitor cells, blood cells, fibroblasts, epithelial cells, etc.). In some embodiments, delivery of a supercharged protein or complex to cells involves administering a supercharged protein or a complex comprising supercharged proteins associated with therapeutic agents to a subject in need thereof.

Supercharged Proteins

Supercharged proteins can be produced by changing non-conserved amino acids on the surface of a protein to more polar or charged amino acid residues. The amino acid residues to be modified may be hydrophobic, hydrophilic, charged, or a combination thereof. Supercharged proteins can also be produced by the attachment of charged moieties to the protein in order to supercharge the protein. Supercharged proteins frequently are resistant to aggregation, have an increased ability to refold, resist improper folding, have improved solubility, and are generally more stable under a wide range of conditions, including denaturing conditions such as heat or the presence of a detergent.

Any protein may be modified using the inventive system to produce a supercharged protein. Natural as well as unnatural proteins (e.g., engineered proteins) may be modified. Examples of proteins that may be modified include receptors, membrane bound proteins, transmembrane proteins, enzymes, transcription factors, extracellular proteins, therapeutic proteins, cytokines, messenger proteins, DNA-binding proteins, RNA-binding proteins, proteins involved in signal transduction, structural proteins, cytoplasmic proteins, nuclear proteins, hydrophobic proteins, hydrophilic proteins, etc. A protein to be modified may be derived from any species of plant, animal, and/or microorganism. In certain embodiments, the protein is a mammalian protein. In certain embodiments, the protein is a human protein. In certain embodiments, the protein is derived from an organism typically used in research. For example, the protein to be modified may be from a primate (e.g., ape, monkey), rodent (e.g., rabbit, hamster, gerbil), pig, dog, cat, fish (e.g., Danio rerio), nematode (e.g., C. elegans), yeast (e.g., Saccharomyces cerevisiae), or bacteria (e.g., E. coli). In certain embodiments, the protein is non-immunogenic. In certain embodiments, the protein is non-antigenic. In certain embodiments, the protein does not have inherent biological activity or has been modified to have no biological activity. In certain embodiments, the protein is chosen based on its targeting ability. In certain embodiments, the protein is green fluorescent protein.

In some embodiments, the protein to be modified is one whose structure has been characterized, for example, by NMR or X-ray crystallography. In some embodiments, the protein to be modified is one whose structure has been correlated and/or related to biochemical activity (e.g., enzymatic activity, protein-protein interactions, etc.). In some embodiments, such information provides guidance for selection of amino acid residues to be modified or not modified (e.g., so that biological function is maintained or so that biological activity can be reduced or eliminated). In certain embodiments, the inherent biological activity of the protein is reduced or eliminated to reduce the risk of deleterious and/or undesired effects.

In some embodiments, the protein to be modified is one that is useful in the delivery of a nucleic acid or other agent to a cell. In some embodiments, the protein to be modified is an imaging, labeling, diagnostic, prophylactic, or therapeutic agent. In some embodiments, the protein to be modified is one that is useful for delivering an agent, e.g., a nucleic acid, to a particular cell. In some embodiments, the protein to be modified is one that has desired biological activity. In some embodiments, the protein to be modified is one that has desired targeting activity. In some embodiments, non-conserved surface residues of a protein of interest are identified and at least some of them replaced with a residue that is hydrophilic, polar, and/or charged at physiological pH. In some embodiments, non-conserved surface residues of a protein of interest are identified and at least some of them replaced with a residue that is positively charged at physiological pH.

The surface residues of the protein to be modified are identified using any method(s) known in the art. In certain embodiments, surface residues are identified by computer modeling of the protein. In certain embodiments, the three-dimensional structure of the protein is known and/or determined, and surface residues are identified by visualizing the structure of the protein. In some embodiments, surface residues are predicted using computer software. In certain particular embodiments, an Average Neighbor Atoms per Sidechain Atom (AvNAPSA) value is used to predict surface exposure. AvNAPSA is an automated measure of surface exposure which has been implemented as a computer program. A low AvNAPSA value indicates a surface exposed residue, whereas a high value indicates a residue in the interior of the protein. In certain embodiments, the software is used to predict the secondary structure and/or tertiary structure of a protein, and surface residues are identified based on this prediction. In some embodiments, the prediction of surface residues is based on hydrophobicity and hydrophilicity of the residues and their clustering in the primary sequence of the protein. Besides in silico methods, surface residues of the protein may also be identified using various biochemical techniques, for example, protease cleavage, surface modification, etc.

Optionally, of the surface residues, it is then determined which are conserved or important to the functioning of the protein. The step of determining which residues are conserved is optional when it is not necessary to preserve the underlying biological activity of the protein. Identification of conserved residues can be determined using any method known in the art. In certain embodiments, conserved residues are identified by aligning the primary sequence of the protein of interest with related proteins. These related proteins may be from the same family of proteins. For example, if the protein is an immunoglobulin, other immunoglobulin sequences may be used. Related proteins may also be the same protein from a different species. For example, conserved residues may be identified by aligning the sequences of the same protein from different species. To give but another example, proteins of similar function or biological activity may be aligned. Preferably, 2, 3, 4, 5, 6, 7, 8, 9, or 10 different sequences are used to determine the conserved amino acids in the protein. In certain embodiments, a residue is considered conserved if over 50%, over 60%, over 70%, over 75%, over 80%, over 90%, or over 95% of the sequences have the same amino acid in a particular position. In other embodiments, the residue is considered conserved if over 50%, over 60%, over 70%, over 75%, over 80%, over 90%, or over 95% of the sequences have the same or a similar (e.g., valine, leucine, and isoleucine; glycine and alanine; glutamine and asparagine; or aspartate and glutamate) amino acid in a particular position. Many software packages are available for aligning and comparing protein sequences as described herein. As would be appreciated by one of skill in the art, either the conserved residues may be determined first or the surface residues may be determined first. The order does not matter. In certain embodiments, a computer software package may determine surface residues and conserved residues simultaneously.
Important residues in the protein may also be identified by mutagenesis of the protein. For example, alanine scanning of the protein can be used to determine the important amino acid residues in the protein. In some embodiments, site-directed mutagenesis may be used. In certain embodiments, conserving the original biological activity of the protein is not important, and therefore, the steps of identifying the conserved residues and preserving them in the supercharged protein are not performed.
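The conservation criterion described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sequences are taken to be already aligned to equal length (with "-" marking gaps), the function name conserved_positions and its parameters are hypothetical, and the similarity groups are limited to the four examples given in the text. "Over 75%" is read as a strict inequality.

```python
from collections import Counter

# Similarity groups limited to the examples named in the text:
# Val/Leu/Ile, Gly/Ala, Gln/Asn, Asp/Glu (one-letter codes).
SIMILAR_GROUPS = [set("VLI"), set("GA"), set("QN"), set("DE")]


def _group_key(residue):
    """Map a residue to its similarity group, or to itself if ungrouped."""
    for index, group in enumerate(SIMILAR_GROUPS):
        if residue in group:
            return "group%d" % index
    return residue


def conserved_positions(aligned, threshold=0.75, allow_similar=False):
    """Return 0-based alignment columns in which more than `threshold`
    of all sequences share the same (or, optionally, a similar) residue.

    `aligned` is a list of pre-aligned, equal-length strings (hypothetical
    input format; gaps are written as '-').
    """
    if not aligned:
        return []
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be pre-aligned to equal length")

    conserved = []
    for col in range(length):
        residues = [seq[col] for seq in aligned if seq[col] != "-"]
        if not residues:
            continue
        keys = [_group_key(r) if allow_similar else r for r in residues]
        top_count = Counter(keys).most_common(1)[0][1]
        # Fraction is taken over all sequences, so gaps count against conservation.
        if top_count / len(aligned) > threshold:
            conserved.append(col)
    return conserved


if __name__ == "__main__":
    alignment = ["MKVLE", "MKILD", "MRVLE", "MKLLE"]  # toy alignment, not real data
    print(conserved_positions(alignment, threshold=0.75))
    print(conserved_positions(alignment, threshold=0.75, allow_similar=True))
```

Surface residues whose columns are not returned by such a function would be the non-conserved candidates for modification; the threshold can be set to any of the percentages listed above.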

Each of the surface residues is identified as hydrophobic or hydrophilic. In certain embodiments, residues are assigned a hydrophobicity score. For example, each surface residue may be assigned an octanol/water logP value. Other hydrophobicity parameters may also be used. Such scales for amino acids have been discussed in: Janin, 1979, Nature, 277:491; Wolfenden et al., 1981, Biochemistry, 20:849; Kyte et al., 1982, J. Mol. Biol., 157:105; Rose et al., 1985, Science, 229:834; Cornette et al., 1987, J. Mol. Biol., 195:659; Charton and Charton, 1982, J. Theor. Biol., 99:629; each of which is incorporated by reference. Any of these hydrophobicity parameters may be used in the inventive method to determine which residues to modify. In certain embodiments, hydrophilic or charged residues are identified for modification.
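As one illustration of assigning a hydrophobicity score, the sketch below uses the hydropathy index of Kyte et al. (one of the scales cited above) rather than octanol/water logP values. The cutoff of 0.0 for calling a residue hydrophobic, the function name classify_surface_residues, and the input format (a sequence string plus 0-based surface positions) are assumptions made for the example only.

```python
# Kyte-Doolittle hydropathy index (Kyte et al., 1982), one-letter codes.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def classify_surface_residues(sequence, surface_positions, cutoff=0.0):
    """Label each listed surface position as hydrophobic or hydrophilic.

    Returns {position: (label, score)}. The cutoff is an illustrative
    assumption; any other scale or threshold could be substituted.
    """
    labels = {}
    for pos in surface_positions:
        score = KYTE_DOOLITTLE[sequence[pos]]
        label = "hydrophobic" if score > cutoff else "hydrophilic"
        labels[pos] = (label, score)
    return labels


if __name__ == "__main__":
    # Toy sequence and surface positions, not taken from any real structure.
    for pos, (label, score) in classify_surface_residues("MKVLE", [1, 2, 4]).items():
        print(pos, label, score)
```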

At least one identified surface residue is then chosen for modification. In certain embodiments, hydrophobic residue(s) are chosen for modification. In other embodiments, hydrophilic and/or charged residue(s) are chosen for modification. In certain embodiments, more than one residue is chosen for modification. In certain embodiments, 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 of the identified residues are chosen for modification. In certain embodiments, over 10, over 15, over 20, or over 25 residues are chosen for modification. As would be appreciated by one of skill in the art, the larger the protein, the more residues will need to be modified. Also, the more hydrophobic or susceptible to aggregation or precipitation the protein is, the more residues may need to be modified. In certain embodiments, multiple variants of a protein, each with different modifications, are produced and tested to determine the best variant in terms of delivery of a nucleic acid to a cell, stability, biocompatibility, and/or biological activity.

In certain embodiments, residues chosen for modification are mutated into more hydrophilic residues (including charged residues). Typically, residues are mutated into more hydrophilic natural amino acids. In certain embodiments, residues are mutated into amino acids that are charged at physiological pH. For example, a residue may be changed to an arginine, aspartate, glutamate, histidine, or lysine. In certain embodiments, all the residues to be modified are changed into the same residue. For example, all the chosen residues are changed to a lysine residue. In other embodiments, the chosen residues are changed into different residues; however, all the final residues may be either positively charged or negatively charged at physiological pH. In certain embodiments, to create a negatively charged protein, all the residues to be mutated are converted to glutamate and/or aspartate residues. In certain embodiments, to create a positively charged protein, all the residues to be mutated are converted to lysine residues. For example, all the chosen residues for modification are asparagine, glutamine, lysine, and/or arginine, and these residues are mutated into aspartate or glutamate residues. To give but another example, all the chosen residues for modification are aspartate, glutamate, asparagine, and/or glutamine, and these residues are mutated into lysine. This approach allows for modifying the net charge on the protein to the greatest extent.

In some embodiments, a protein may be modified to keep the net charge on the modified protein the same as on the unmodified protein. In some embodiments, a protein may be modified to decrease the overall net charge on the protein while increasing the total number of charged residues on the surface. In certain embodiments, the theoretical net charge is increased by at least +1, at least +2, at least +3, at least +4, at least +5, at least +10, at least +15, at least +20, at least +25, at least +30, at least +35, or at least +40. In certain embodiments, the theoretical net charge is decreased by at least −1, at least −2, at least −3, at least −4, at least −5, at least −10, at least −15, at least −20, at least −25, at least −30, at least −35, or at least −40. In certain embodiments, the chosen amino acids are changed into non-ionic, polar residues (e.g., cysteine, serine, threonine, tyrosine, glutamine, asparagine).
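The effect of a set of substitutions on theoretical net charge can be tallied with a short script. The sketch below is illustrative only and rests on a simplifying assumption that is not stated in this section: lysine and arginine are counted as +1, aspartate and glutamate as −1, and every other residue (including histidine and the termini) as neutral. The function names and the toy sequence are hypothetical.

```python
# Assumed per-residue charges at physiological pH (simplified model).
RESIDUE_CHARGE = {"K": 1, "R": 1, "D": -1, "E": -1}


def theoretical_net_charge(sequence):
    """Sum the assumed charges over a one-letter amino acid sequence."""
    return sum(RESIDUE_CHARGE.get(residue, 0) for residue in sequence)


def apply_substitutions(sequence, substitutions):
    """Return a new sequence with {0-based position: new residue} applied."""
    residues = list(sequence)
    for position, new_residue in substitutions.items():
        residues[position] = new_residue
    return "".join(residues)


if __name__ == "__main__":
    parent = "MDEQNKLAE"  # toy sequence, not a real protein
    # Mutate Asp, Glu, Gln, and Asn at positions 1-4 to lysine.
    variant = apply_substitutions(parent, {1: "K", 2: "K", 3: "K", 4: "K"})
    before = theoretical_net_charge(parent)
    after = theoretical_net_charge(variant)
    print(variant, before, after, after - before)
```

In this toy case, replacing an acidic residue with lysine shifts the net charge by +2, while replacing a neutral residue shifts it by +1, which is why mutating aspartate and glutamate changes the net charge to the greatest extent.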

In certain embodiments, the amino acid residues mutated to charged amino acid residues are separated from each other by at least 1, at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, at least 15, at least 20, or at least 25 amino acid residues. In certain embodiments, the amino acid residues mutated to positively charged amino acid residues (e.g., lysine) are separated from each other by at least 1, at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, at least 15, at least 20, or at least 25 amino acid residues. Typically, these intervening sequences are based on the primary amino acid sequence of the protein being supercharged. In certain embodiments, only two charged amino acids are allowed to be in a row in a supercharged protein. In certain embodiments, only three or fewer charged amino acids are allowed to be in a row in a supercharged protein. In certain embodiments, only four or fewer charged amino acids are allowed to be in a row in a supercharged protein. In certain embodiments, only five or fewer charged amino acids are allowed to be in a row in a supercharged protein.
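By way of non-limiting illustration, the limit on consecutive charged amino acids described above may be checked computationally. The following is a minimal sketch only; the function names `longest_charged_run` and `satisfies_run_limit` are hypothetical, and the default set of charged residues (aspartate, glutamate, lysine, arginine, in one-letter code) is an assumption. Histidine, for example, may be added to the set by the user.

```python
from itertools import groupby

# Assumption: D, E, K, and R are treated as charged; histidine is excluded
# by default and can be included by passing charged="DEKRH".
DEFAULT_CHARGED = "DEKR"


def longest_charged_run(sequence, charged=DEFAULT_CHARGED):
    """Return the length of the longest stretch of consecutive charged residues."""
    charged_set = set(charged)
    runs = (
        sum(1 for _ in group)
        for is_charged, group in groupby(sequence.upper(), key=lambda aa: aa in charged_set)
        if is_charged
    )
    return max(runs, default=0)


def satisfies_run_limit(sequence, max_run, charged=DEFAULT_CHARGED):
    """True if no more than max_run charged residues occur in a row."""
    return longest_charged_run(sequence, charged) <= max_run
```

For instance, a candidate sequence could be screened with `satisfies_run_limit(candidate, 3)` to enforce the "three or fewer in a row" embodiment.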

In certain embodiments, a surface exposed loop, helix, turn, or other secondary structure may contain only 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 charged residues. Distributing the charged residues over the protein is typically thought to allow for more stable proteins. In certain embodiments, only 1, 2, 3, 4, or 5 residues per 15-20 amino acids of the primary sequence are mutated to charged amino acids (e.g., lysine). In certain embodiments, on average only 1, 2, 3, 4, or 5 residues per 10 amino acids of the primary sequence are mutated to charged amino acids (e.g., lysine). In certain embodiments, on average only 1, 2, 3, 4, or 5 residues per 15 amino acids of the primary sequence are mutated to charged amino acids (e.g., lysine). In certain embodiments, on average only 1, 2, 3, 4, or 5 residues per 20 amino acids of the primary sequence are mutated to charged amino acids (e.g., lysine). In certain embodiments, on average only 1, 2, 3, 4, or 5 residues per 25 amino acids of the primary sequence are mutated to charged amino acids (e.g., lysine). In certain embodiments, on average only 1, 2, 3, 4, or 5 residues per 30 amino acids of the primary sequence are mutated to charged amino acids (e.g., lysine).

In certain embodiments, at least 50%, at least 60%, at least 70%, at least 80%, or at least 90% of the mutated charged amino acid residues of the supercharged protein are solvent exposed. In certain embodiments, at least 50%, at least 60%, at least 70%, at least 80%, or at least 90% of the mutated charged amino acid residues of the supercharged protein are on the surface of the protein. In certain embodiments, less than 5%, less than 10%, less than 20%, less than 30%, less than 40%, or less than 50% of the mutated charged amino acid residues are not solvent exposed. In certain embodiments, less than 5%, less than 10%, less than 20%, less than 30%, less than 40%, or less than 50% of the mutated charged amino acid residues are internal amino acid residues.

In some embodiments, amino acids are selected for modification using one or more predetermined criteria. For example, to generate a superpositively charged protein, AvNAPSA values may be used to identify aspartic acid, glutamic acid, asparagine, and/or glutamine residues with AvNAPSA values below a certain threshold value, and one or more (e.g., all) of these residues may be changed to lysines. In some embodiments, to generate a superpositively charged protein, AvNAPSA is used to identify aspartic acid, glutamic acid, asparagine, and/or glutamine residues with AvNAPSA below a certain threshold value, and one or more (e.g., all) of these are changed to arginines. In some embodiments, to generate a supernegative protein, AvNAPSA is used to identify asparagine, glutamine, lysine, and/or arginine residues with AvNAPSA values below a certain threshold value, and one or more (e.g., all) of these are changed to aspartic acid residues. In some embodiments, to generate a supernegatively charged protein, AvNAPSA is used to identify asparagine, glutamine, lysine, and/or arginine residues with AvNAPSA values below a certain threshold value, and one or more (e.g., all) of these are changed to glutamic acid residues. In some embodiments, the certain threshold value is 40 or below. In some embodiments, the certain threshold value is 35 or below. In some embodiments, the certain threshold value is 30 or below. In some embodiments, the certain threshold value is 25 or below. In some embodiments, the certain threshold value is 20 or below. In some embodiments, the certain threshold value is 19 or below, 18 or below, 17 or below, 16 or below, 15 or below, 14 or below, 13 or below, 12 or below, 11 or below, 10 or below, 9 or below, 8 or below, 7 or below, 6 or below, 5 or below, 4 or below, 3 or below, 2 or below, or 1 or below. In some embodiments, the certain threshold value is 0.

In some embodiments, solvent-exposed residues are identified by the number of neighbors. In general, residues that have more neighbors are less solvent-exposed than residues that have fewer neighbors. In some embodiments, solvent-exposed residues are identified by half sphere exposure, which accounts for the direction of the amino acid side chain (Hamelryck, 2005, Proteins, 59:8-48; incorporated herein by reference). In some embodiments, solvent-exposed residues are identified by computing the solvent exposed surface area, accessible surface area, and/or solvent excluded surface of each residue. See, e.g., Lee et al., J. Mol. Biol. 55(3):379-400, 1971; Richmond, J. Mol. Biol. 178:63-89, 1984; each of which is incorporated herein by reference.

The desired modifications or mutations in the protein may be accomplished using any techniques known in the art. Recombinant DNA techniques for introducing such changes in a protein sequence are well known in the art. In certain embodiments, the modifications are made by site-directed mutagenesis of the polynucleotide encoding the protein. Other techniques for introducing mutations are discussed in Molecular Cloning: A Laboratory Manual, 2nd Ed., ed. by Sambrook, Fritsch, and Maniatis (Cold Spring Harbor Laboratory Press: 1989); the treatise, Methods in Enzymology (Academic Press, Inc., N.Y.); Ausubel et al. Current Protocols in Molecular Biology (John Wiley & Sons, Inc., New York, 1999); each of which is incorporated herein by reference. The modified protein is expressed and tested. In certain embodiments, a series of variants is prepared, and each variant is tested to determine its biological activity and its stability. The variant chosen for subsequent use may be the most stable one, the most active one, or the one with the greatest overall combination of activity and stability. After a first set of variants is prepared, an additional set of variants may be prepared based on what is learned from the first set. Variants are typically created and overexpressed using recombinant techniques known in the art.

Supercharged proteins may be further modified. Proteins including supercharged proteins can be modified using techniques known to those of skill in the art. For example, supercharged proteins may be modified chemically or biologically. One or more amino acids may be added, deleted, or changed from the primary sequence. For example, a polyhistidine tag or other tag may be added to the supercharged protein to aid in the purification of the protein. Other peptides or proteins may be added onto the supercharged protein to alter the biological, biochemical, and/or biophysical properties of the protein. For example, an endosomolytic peptide may be added to the primary sequence of the supercharged protein, or a targeting peptide may be added to the primary sequence of the supercharged protein. Other modifications of the supercharged protein include, but are not limited to, post-translational modifications (e.g., glycosylation, phosphorylation, acylation, lipidation, farnesylation, acetylation, proteolysis, etc.). In certain embodiments, the supercharged protein may be modified to reduce its immunogenicity. In certain embodiments, the supercharged protein may be modified to enhance its ability to deliver a nucleic acid to a cell. In certain embodiments, the supercharged protein may be conjugated to a polymer. For example, the protein may be PEGylated by conjugating the protein to a polyethylene glycol (PEG) polymer. One of skill in the art can envision a multitude of ways of modifying the supercharged protein without departing from the scope of the present invention. Methods described herein allow supercharging proteins by imposing changes in the protein sequence of the protein to be supercharged. Other methods can be used to produce supercharged proteins without modification of the protein sequence. For example, moieties that alter charge can be attached to proteins (e.g., by chemical or enzymatic reactions) to provide surface charge to achieve supercharging. In certain embodiments, the method of modifying proteins described in Shaw et al., Protein Science 17:1446, 2008 is used to supercharge a protein.

The international PCT patent application (PCT/US07/70254, filed Jun. 1, 2007, published as WO 2007/143574 on Dec. 13, 2007, entitled “Protein Surface Remodeling;” incorporated herein by reference) and U.S. provisional patent applications (U.S. Ser. No. 60/810,364, filed Jun. 2, 2006, and U.S. Ser. No. 60/836,607, filed Aug. 9, 2006; both of which are entitled “Protein Surface Remodeling”; and both of which are incorporated herein by reference) describe the design and creation of variants of several different proteins. These variants have been shown to be more stable and to retain their fluorescence. For example, a green fluorescent protein (GFP) from Aequorea victoria is described in GenBank Accession Number P42212, incorporated herein by reference. The amino acid sequence of this wild type GFP is as follows:

(SEQ ID NO: 1) MSKGEELFTGVVPILVELDGDVNGHKFSVSGEGEGDATYGKLTLKFICTTGKLPVPWPTLVTTFSYGVQCFSRYPDHMKQHDFFKSAMPEGYVQERTIFFKDDGNYKTRAEVKFEGDTLVNRIELKGIDFKEDGNILGHKLEYNYNSHNVYIMADKQKNGIKVNFKIRHNIEDGSVQLADHYQQNTPIGDGPVLLPDNHYLSTQSALSKDPNEKRDHMVLLEFVTAAGITHGMDELYK

Wild type GFP has a theoretical net charge of −7. Variants with a theoretical net charge of −29, −30, −25, +15, +25, +36, +48, and +49 have been created. Even after heating the +36 GFP to 95° C., 100% of the variant protein is soluble and the protein retains ≧70% of its fluorescence. +15, +25, and +36 GFP have been found to be particularly useful in transfecting nucleic acids into cells. In particular, +36 GFP has been found to be highly cell permeable and capable of efficiently delivering nucleic acids into a variety of mammalian cells, including cell lines resistant to transfection using other transfection methods. Therefore, GFP or other proteins with a net charge of at least +25, at least +30, at least +35, or at least +40 are thought to be particularly useful in transfecting nucleic acids into a cell.
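By way of non-limiting illustration, a theoretical net charge may be estimated from a one-letter amino acid sequence by simple residue counting. The following is a minimal sketch under stated assumptions: lysine and arginine are each counted as +1, aspartate and glutamate as −1, and histidine and the chain termini are ignored. The function names are hypothetical, and the values reported herein may have been obtained under a different convention, so this sketch is not asserted to reproduce them.

```python
def theoretical_net_charge(sequence, positive="KR", negative="DE"):
    """Estimate net charge as (number of K + R) minus (number of D + E).

    Assumption: each listed residue contributes exactly +1 or -1;
    histidine, termini, and any non-letter characters are ignored.
    """
    seq = sequence.upper()
    n_positive = sum(seq.count(aa) for aa in positive)
    n_negative = sum(seq.count(aa) for aa in negative)
    return n_positive - n_negative


def net_charge_change(original, modified):
    """Change in estimated net charge produced by a set of mutations."""
    return theoretical_net_charge(modified) - theoretical_net_charge(original)
```

Under this counting convention, replacing an aspartate or glutamate with lysine shifts the estimate by +2, whereas replacing an asparagine or glutamine with lysine shifts it by +1.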

The capacity of supercharged GFPs to penetrate mammalian cells increases as a function of theoretical net charge even at charges as high as +25 and +36. See U.S. provisional patent applications, U.S. Ser. No. 61/173,430 and 61/105,287, and PCT application, PCT/US2009/041984, each of which is incorporated herein by reference. This property contrasts with peptidic protein transduction domains (PTDs) such as arginine oligomers, which have been observed to lose mammalian cell penetration ability when their net theoretical charge exceeds +15 (Mitchell et al., J. Pept. Res. 56, 318-325, 2000). The cell penetration potency of +36 GFP may therefore be due in part to charge distribution over a comparatively large area, which may provide a more stable and extended cationic surface that interacts more effectively with cells (e.g., mammalian cells).

The significantly greater potency of +36 GFP-mediated protein delivery compared with that of Tat and Arg₉ may also be a consequence of its structure. Unlike the globular β-barrel of GFP, the nine-residue Tat peptide and Arg₉ peptides are unlikely to be well-folded, although the former has been observed to adopt a structure similar to a poly(proline) II helix (Ruzza et al., J. Pept. Sci. 10, 423-426, 2004).

Accordingly, in some embodiments, particularly useful supercharged proteins are proteins that allow for a charge distribution or a surface charge density similar to that of +36 GFP. Further, in some embodiments, particularly useful supercharged proteins are proteins exhibiting a stable folded structure not easily perturbed by supercharging, thus allowing the supercharged protein to be well folded. In some embodiments, particularly useful supercharged proteins are proteins sharing a structural feature with a supercharged protein described herein, for example, a globular structure, or a β-barrel structure. Protein folding, protein fold structure stability and perturbation of protein folding by substitution of specific amino acids with differently charged amino acids, charge distribution, and surface charge density can be modeled in silico by methods and algorithms provided herein and others known to those of skill in the art. Accordingly, it will be apparent to those of skill in the art from no more than routine experimentation, whether a supercharged protein in question will be well folded. Thus, those of skill in the art will be able to identify from a given amino acid sequence whether a given supercharged protein will be useful for a system for cell delivery or methods as described herein.

The amino acid sequences of the variants of GFP that have been created include:

GFP-NEG7 (SEQ ID NO: 2) MGHHHHHHGGASKGEELFTGVVPILVELDGDVNGHKFSVRGEGEGDATNGKLTLKFICTTGKLPVPWPTLVTTLTYGVQCFSRYPDHMKQHDFFKSAMPEGYVQERTISFKDDGTYKTRAEVKFEGDTLVNRIELKGIDFKEDGNILGHKLEYNFNSHNVYITADKQKNGIKANFKIRHNVEDGSVQLADHYQQNTPIGDGPVLLPDNHYLSTQSALSKDPNEKRDHMVL LEFVTAAGITHGMDELYKGFP-NEG25 (SEQ ID NO: 3) MGHHHHHHGGASKGEELFTGVVPILVELDGDVNGHEFSVRGEGEGDATEGELTLKFICTTGELPVPWPTLVTTLTYGVQCFSRYPDHMKQHDFFKSAMPEGYVQERTISFKDDGTYKTRAEVKFEGDTLVNRIELKGIDFKEDGNILGHKLEYNFNSHDVYITADKQENGIKAEFEIRHNVEDGSVQLADHYQQNTPIGDGPVLLPDDHYLSTESALSKDPNEDRDHMVL LEFVTAAGIDHGMDELYKGFP-NEG29 (SEQ ID NO: 4) MGHHHHHHGGASKGEELFDGEVPILVELDGDVNGHEFSVRGEGEGDATEGELTLKFICTTGELPVPWPTLVTTLTYGVQCFSRYPDHMDQHDFFKSAMPEGYVQERTISFKDDGTYKTRAEVKFEGDTLVNRIELKGIDFKEDGNILGHKLEYNFNSHDVYITADKQENGIKAEFEIRHNVEDGSVQLADHYQQNTPIGDGPVLLPDDHYLSTESALSKDPNEDRDHMVL LEFVTAAGIDHGMDELYKGFP-NEG30 (SEQ ID NO: 5) MGHHHHHHGGASKGEELFDGVVPILVELDGDVNGHEFSVRGEGEGDATEGELTLKFICTTGELPVPWPTLVTTLTYGVQCFSDYPDHMDQHDFFKSAMPEGYVQERTISFKDDGTYKTRAEVKFEGDTLVNRIELKGIDFKEDGNILGHKLEYNFNSHDVYITADKQENGIKAEFEIRHNVEDGSVQLADHYQQNTPIGDGPVLLPDDHYLSTESALSKDPNEDRDHMVL LEFVTAAGIDHGMDELYKGFP-POS15 (SEQ ID NO: 6) MGHHHHHHGGASKGERLFTGVVPILVELDGDVNGHKFSVRGEGEGDATRGKLTLKFICTTGKLPVPWPTLVTTLTYGVQCFSRYPKHMKRHDFFKSAMPEGYVQERTISFKKDGTYKTRAEVKFEGRTLVNRIELKGRDFKEKGNILGHKLEYNFNSHNVYITADKRKNGIKANFKIRHNVKDGSVQLADHYQQNTPIGRGPVLLPRNHYLSTRSALSKDPKEKRDHMVL LEFVTAAGITHGMDELYKGFP-POS25 (SEQ ID NO: 18) MGHHHHHFIGGASKGERLFTGVVPILVELDGDVNGHKFSVRGKGKGDATRGKLTLKFICTTGKLPVPWPTLVTTLTYGVQCFSRYPKHMKRHDFFKSAMPKGYVQERTISFKKDGTYKTRAEVKFEGRTLVNRIKLKGRDFKEKGNILGHKLRYNFNSHNVYITADKRKNGIKANFKIRHNVKDGSVQLADHYQQNTPIGRGPVLLPRNHYLSTRSALSKDPKEKRDHMV LLEFVTAAGITHGMDELYKGFP-POS36 (SEQ ID NO: 7) MGHHHHHHGGASKGERLFRGKVPILVELKGDVNGHKFSVRGKGKGDATRGKLTLKFICTTGKLPVPWPTLVTTLTYGVQCFSRYPKHMKRHDFFKSAMPKGYVQERTISFKKDGKYKTRAEVKFEGRTLVNRIKLKGRDFKEKGNILGHKLRYNFNSHKVYITADKRKNGIKAKFKIRHNVKDGSVQLADHYQQNTPIGRGPVLLPRNHYLSTRSKLSKDPKEKRDHMVL LEFVTAAG1KHGRDERYKGFP-POS42 (SEQ ID NO: 8) 
MGHHHHHHGGRSKGKRLFRGKVPILVELKGDVNGHKFSVRGKGKGDATRGKLTLKFICTTGKLPVPWPTLVTTLTYGVQCFSRYPKHMKRHDFFKSAMPKGYVQERTISFKKDGKYKTRAEVKFEGRTLVNRIKLKGRDFKEKGNILGHKLRYNFNSHKVYITADKRKNGIKAKFKIRHNVKDGSVQLADHYQQNTPIGRGPVLLPRKHYLSTRSKLSKDPKEKRDHMVL LEFVTAAGIKHGRKERYKGFP-POS48 (SEQ ID NO: 9) MGHHHHHHGGRSKGKRLFRGKVPILVKLKGDVNGHKFSVRGKGKGDATRGKLTLKFICTTGKLPVPWPTLVTTLTYGVQCFSRYPKHMKRHDFFKSAMPKGYVQERTISFKKDGKYKTRAEVKFKGRTLVNRIKLKGRDFKEKGNILGHKLRYNFNSHKVYITADKRKNGIKAKFKIRHNVKDGSVQLAKHYQQNTPIGRGPVLLPRKHYLSTRSKLSKDPKEKRDHMVL LEFVTAAGIKHGRKERYKGFP-POS49 (SEQ ID NO: 10) MGHHHHHHGGRSKGKRLFRGKVPILVKLKGDVNGHKFSVRGKGKGDATRGKLTLKFICTTGKLPVPWPTLVTTLTYGVQCFSRYPKHMKRHDFFKSAMPKGYVQERTISFKKDGKYKTRAEVKFKGRTLVNRIKLKGRDFKEKGNILGHKLRYNFNSHKVYITADKRKNGIKAKFKIRHNVKDGSVQLAKHYQQNTPIGRGPVLLPRKHYLSTRSKLSKDPKEKRDHMVL KEFVTAAGIKHGRKERYK

In order to promote the escape of the supercharged protein, or delivered agent, e.g., nucleic acid, from the endosomes, a supercharged protein may be fused to or associated with a protein, peptide, or other entity known to enhance endosome degradation or lysis of the endosome. In certain embodiments, the peptide is hemagglutinin 2 (HA2) peptide, which is known to enhance endosome degradation. In certain particular embodiments, HA2 peptide is fused to supercharged GFP (e.g., +36 GFP). In certain particular embodiments, the fused protein is of the sequence:

+36 GFP-HA2 (SEQ ID NO: 19)
MGHHHHHHGGASKGERLFRGKVPILVELKGDVNGHKFSVRGKGKGDATRGKLTLKFICTTGKLPVPWPTLVTTLTYGVQCFSRYPKHMKRHDFFKSAMPKGYVQERTISFKKDGKYKTRAEVKFEGRTLVNRIKLKGRDFKEKGNILGHKLRYNFNSHKVYITADKRKNGIKAKFKIRHNVKDGSVQLADHYQQNTPIGRGPVLLPRNHYLSTRSKLSKDPKEKRDHMVLLEFVTAAGIKHGRDERYKGSAGSAAGSGEFGLFGAIAGFIENGWEGMIDG

In certain embodiments, the endosomolytic peptide is melittin peptide (GIGAVLKVLTTGLPALISWIKRKRQQ, SEQ ID NO: 20) (Meyer et al. JACS 130(11): 3272-3273, 2008; which is incorporated herein by reference). In certain embodiments, the melittin peptide is modified by one, two, three, four, or five amino acid substitutions, deletions, and/or additions. In certain embodiments, the melittin peptide is of the sequence: CIGAVLKVLTTGLPALISWIKRKRQQ (SEQ ID NO: 21). In certain particular embodiments, the melittin peptide is fused to supercharged GFP (e.g., +36 GFP).

In certain embodiments, the endosomolytic peptide is penetratin peptide (RQIKIWFQNRRMKWKK-amide, SEQ ID NO: 22), bovine PrP (1-30) peptide (MVKSKIGSWILVLFVAMWSDVGLCKKRPKP-amide, SEQ ID NO: 23), MPGΔ^(NLS) peptide (which lacks a functional nuclear localization sequence because of a K->S substitution) (GALFLGWLGAAGSTMGAPKSKRKV, SEQ ID NO: 24), TP-10 peptide (AGYLLGKINLKALAALAKKIL-amide, SEQ ID NO: 25), and/or EB1 peptide (LIRLWSHLIHIWFQNRRLKWKKK-amide, SEQ ID NO: 26) (Lundberg et al. 2007, FASEB J. 21:2664; incorporated herein by reference). In certain embodiments, the penetratin, PrP (1-30), MPG, TP-10, and/or EB1 peptide is modified by one, two, three, four, or five amino acid substitutions, deletions, and/or additions. In certain particular embodiments, the PrP (1-30), MPG, TP-10, and/or EB1 peptide is fused to supercharged GFP (e.g., +36 GFP).

Other peptides or proteins may also be fused to the supercharged protein. For example, a targeting peptide may be fused to the supercharged protein in order to selectively deliver the supercharged protein, or associated agent, e.g., nucleic acid, to a particular cell type. Peptides or proteins that enhance the transfection of the nucleic acid may also be used. In certain embodiments, the peptide fused to the supercharged protein is a peptide hormone. In certain embodiments, the peptide fused to the supercharged protein is a peptide ligand.

As would be appreciated by one of skill in the art, homologous proteins are also considered to be within the scope of this invention. For example, any protein that includes a stretch of about 20, about 30, about 40, about 50, or about 100 amino acids which are about 40%, about 50%, about 60%, about 70%, about 80%, about 90%, about 95%, or about 100% identical to any of the above sequences can be utilized in accordance with the invention. Alternatively or additionally, addition and deletion variants can be utilized in accordance with the invention. In certain embodiments, any GFP with a mutated residue as shown in any of the above sequences can be utilized in accordance with the invention. In certain embodiments, a protein sequence to be utilized in accordance with the invention includes 2, 3, 4, 5, 6, 7, 8, 9, 10, or more mutations as shown in any of the sequences above.

Other proteins that may be supercharged and used, e.g., in the delivery of agents, e.g., nucleic acids, include other GFP-style fluorescent proteins. In certain embodiments, the supercharged protein is a supercharged version of blue fluorescent protein. In certain embodiments, the supercharged protein is a supercharged version of cyan fluorescent protein. In certain embodiments, the supercharged protein is a supercharged version of yellow fluorescent protein. Exemplary fluorescent proteins include, but are not limited to, enhanced green fluorescent protein (EGFP), AcGFP, TurboGFP, Emerald, Azami Green, ZsGreen, EBFP, Sapphire, T-Sapphire, ECFP, mCFP, Cerulean, CyPet, AmCyan1, Midori-Ishi Cyan, mTFP1 (Teal), enhanced yellow fluorescent protein (EYFP), Topaz, Venus, mCitrine, YPet, PhiYFP, ZsYellow1, mBanana, Kusabira Orange, mOrange, dTomato, dTomato-Tandem, DsRed, DsRed2, DsRed-Express (T1), DsRed-Monomer, mTangerine, mStrawberry, AsRed2, mRFP1, JRed, mCherry, HcRed1, mRaspberry, HcRed-Tandem, mPlum, and AQ143.

Yet other proteins that may be supercharged and used, e.g., in the delivery of an agent, e.g., nucleic acids, include histone components or histone-like proteins. In certain embodiments, the histone component is histone linker H1. In certain embodiments, the histone component is core histone H2A. In certain embodiments, the histone component is core histone H2B. In certain embodiments, the histone component is core histone H3. In certain embodiments, the histone component is core histone H4. In certain embodiments, the protein is the archaeal histone-like protein, HPhA. In certain embodiments, the protein is the bacterial histone-like protein, TmHU.

Other proteins that may be supercharged and used, e.g., in the delivery of an agent, e.g., nucleic acids, include high-mobility-group proteins (HMGs). In certain embodiments, the protein is HMG1. In certain embodiments, the protein is HMG17. In certain embodiments, the protein is HMG1-2.

Other proteins that may be supercharged and used, e.g., in the delivery of an agent, e.g., nucleic acids, include anti-cancer agents, such as anti-apoptotic agents, cell cycle regulators, etc.

Other proteins that may be supercharged and used, e.g., in the delivery of an agent, e.g., nucleic acids, are enzymes, including, but not limited to, amylases, pectinases, hydrolases, proteases, glucose isomerase, lipases, phytases, etc. In some embodiments, proteins that may be supercharged and used, e.g., in the delivery of an agent, e.g., nucleic acids, are lysosomal enzymes, including, but not limited to, alglucerase, imiglucerase, agalsidase beta, α-1-iduronidase, acid α-glucosidase, iduronate-2-sulfatase, N-acetylgalactosamine-4-sulfatase, etc. (Wang et al., 2008, NBT, 26:901-08; incorporated herein by reference).

Other proteins that may be supercharged and used, e.g., in the delivery of an agent, e.g., nucleic acids, are presented in Table 1. Some of the proteins listed in Table 1 include a listing of residues that may be modified in order to supercharge those proteins. The residues were identified computationally by downloading a PDB file of the protein of interest. The residues of the PDB file were sorted by ascending AvNAPSA values, and the first 15 ASP, GLU, ASN or GLN residues were proposed for mutation to LYS.
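By way of non-limiting illustration, the selection step described above (sort by ascending AvNAPSA, keep the first 15 ASP, GLU, ASN, or GLN residues, and propose LYS) may be sketched as follows. This sketch assumes that per-residue AvNAPSA values have already been computed by a separate tool; the `Residue` record and the function name `propose_lysine_mutations` are hypothetical and introduced only for illustration.

```python
from typing import List, NamedTuple


class Residue(NamedTuple):
    name: str       # three-letter residue name, e.g. "GLU"
    chain: str      # chain identifier, e.g. "A"
    number: int     # residue number as given in the PDB file
    avnapsa: float  # precomputed AvNAPSA value (assumed supplied by the user)


# Residue types proposed for mutation to lysine, per the description above.
TARGET_NAMES = {"ASP", "GLU", "ASN", "GLN"}


def propose_lysine_mutations(residues: List[Residue], count: int = 15) -> List[Residue]:
    """Return up to `count` target residues with the lowest AvNAPSA values."""
    candidates = [r for r in residues if r.name in TARGET_NAMES]
    # sorted() is stable, so residues with equal AvNAPSA keep their input order.
    return sorted(candidates, key=lambda r: r.avnapsa)[:count]
```

The `count` parameter defaults to 15 to mirror Table 1; a threshold-based variant could instead keep every candidate whose AvNAPSA value falls below a chosen cutoff.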

PDB files, by convention, number amino acids by their order in the wild type protein. The PDB file, however, may not contain the full length wild type protein. The input protein sequence is the sequence of the amino acids that are included in the PDB. The proposed mutations provide the number of the amino acid in the full length wild type protein and also the number in the input protein sequence. The proposed mutations are provided in the following format: Wildtype residue_Chain: Residue Number in Wildtype Protein Chain (Residue Number in Input Chain)_Proposed Residue. Wildtype residue refers to the identity of the amino acid in the wild type protein. Chain refers to the designation of the peptide chain of the specified mutation. Residue number in wildtype protein refers to the number of the amino acid in the designated protein chain of the specified mutation in the full length wild type protein. Residue number in input chain refers to the number of the amino acid in the designated protein chain that was included in the analyzed PDB.
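By way of non-limiting illustration, an entry in the mutation format described above (e.g., "GLU_A: 514(103)_LYS") may be parsed into its five fields as sketched below. The `Mutation` record, the function names, and the regular expression are hypothetical and are offered only as one possible reading of the format; the sketch assumes a single-character chain designation and accepts either an ASCII hyphen or a typographic minus sign before the wild type residue number.

```python
import re
from typing import NamedTuple


class Mutation(NamedTuple):
    wildtype: str         # residue in the wild type protein, e.g. "GLU"
    chain: str            # peptide chain designation, e.g. "A"
    wildtype_number: int  # residue number in the full length wild type chain
    input_number: int     # residue number in the input (PDB) chain
    proposed: str         # proposed replacement residue, e.g. "LYS"


# Assumed layout: WT_Chain: wildtypeNumber(inputNumber)_NEW
_PATTERN = re.compile(
    r"^\s*([A-Z]{3})_(\w)\s*:\s*([-\u2212]?\d+)\s*\((\d+)\)_([A-Z]{3})\s*$"
)


def parse_mutation(text: str) -> Mutation:
    """Parse one mutation entry; raise ValueError if it does not match."""
    match = _PATTERN.match(text)
    if match is None:
        raise ValueError(f"unrecognized mutation entry: {text!r}")
    wildtype, chain, wt_number, input_number, proposed = match.groups()
    # Normalize a typographic minus sign so int() can read negative numbers.
    wt_number = wt_number.replace("\u2212", "-")
    return Mutation(wildtype, chain, int(wt_number), int(input_number), proposed)


def format_mutation(mutation: Mutation) -> str:
    """Render a Mutation back into the textual format."""
    return (
        f"{mutation.wildtype}_{mutation.chain}: "
        f"{mutation.wildtype_number}({mutation.input_number})_{mutation.proposed}"
    )
```

For example, `parse_mutation("GLU_A: 514(103)_LYS")` yields a record whose `wildtype_number` is 514 and whose `input_number` is 103.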

TABLE 1 Exemplary Proteins that can be Supercharged15 Possible Mutations to Generate Positively Supercharged ProteinPROTEIN TYPE Wildtype residue_Chain: Residue Number in Protein SubtypeWildtype Protein Chain (Residue Protein (PDB #) Input Protein SequenceNumber in InputChain)_Proposed Residue MEMBRANE PROTEINS Cystic fibrosisChain A:  transmembrane STTEVVMENVTAFWEEGFGELFEASP_A: 513(102)_LYS, GLU_A: 514(103)_LYS, conductanceKAKGTPVLKDINFKIERGQLLAV GLU_A: 656(238)_LYS, GLU_A: 474(64)_LYS,regulator(CFTR) AGSTGAGKTSLLMMIMGELEPSEGLU_A: 528(117)_LYS, GLU_A: 535(124)_LYS, (2bbs) GKIKHSGRISFCSQNSWIMPGTIASN_A: 635(220)_LYS, ASN_A: 494(84)_LYS, KENIIGVSYDEYRYRSVIKACQLASP_A: 579(164)_LYS, ASP_A: 639(224)_LYS, EEDISKFAEKDNIVLITLSGGQRGLN_A: 652(234)_LYS, GLU_A: 402(15)_LYS, ARISLARAVYKDADLYLLDSPFGASP_A: 565(150)_LYS, GLU_A: 664(246)_LYS, YLDVLTEKEIFESCVCKLMANKTGLU_A: 403(16)_LYS, RILVTSKMEHLKKADKILILHEG SSYFYGTFSELQNLRPDFSSKLMSFDQFSAERRNSILTETLHRFSL (SEQ ID NO: 27) RECEPTORS Cytokine ReceptorsType I EPO receptor (leer) Chain B:  DPKFESKAALLAARGPEELLCFTASP_B: 8(1)_LYS, ASP_B: 133(126)_LYS, ERLEDLVCFWEEAASAGVGPGQYASP B: 61(54)_LYS, GLU_B: 134(127)_LYS, SFSYQLEDEPWKLCRLHQAPTARGLU_B: 147(140)_LYS, ASN_B: 185(178)_LYS, GAVRFWCSLPTADTSSFVPLELRGLU_B: 12(5)_LYS, GLU_B: 62(55)_LYS, VTAASGAPRYHRVIHINEVVLLDGLU_B: 24(17)_LYS, GLN_B: 164(157)_LYS, APVGLVARLADESGHVVLRWLPPGLN_B: 170(163)_LYS, GLU_B: 60(53)_LYS, PETPMTSHIRYEVDVSAGQGAGSGLU_B: 25(18)_LYS, GLN_B: 52(45)_LYS, VQRVEILEGRTECVLSNLRGRTRGLU_B: 173(166)_LYS YTFAVRARMAEPSFGGFWSEWSE PVSL (SEQ ID NO: 28)GM-CSF receptor G-CSF receptor Chain B: (2d9q) CGHISVSAPIVHLGDPITASCIIASN_B: 84(82)_LYS, ASP_B: 57(55)_LYS, KQNCSHLDPEPQILWRLGAELQPASP_B: 213(211)_LYS, ASP_B: 158(156)_LYS, GGRQQRLSDGTQESIITLPHLNHGLN_B: 222(213)_LYS, GLU_B: 253(244)_LYS, TQAFLSCSLNWGNSLQILDQVELASP_B: 149(147)_LYS, GLN_B: 234(225)_LYS, RAGYPPAIPHNLSCLMNLTTSSLGLN_B: 160(158)_LYS, GLU_B: 270(261)_LYS, ICQWEPGPETHLPTSFTLKSFKSGLU_B: 45(43)_LYS, GLN_B: 145(143)_LYS, 
RGNCQTQGDSILDCVPKDGQSHCGLU_B: 308(299)_LYS, ASN_B: 28(26)_LYS, SIPRKHLLLYQNMGIWVQAENALGLU_B: 93(91)_LYS GTSMSPQLCLDPMDVVKLEPPML RTMDPQAGCLQLSWEPWQPGLHINQKCELRHKPQRGEASWALVGPL PLEALQYELCGLLPATAYTLQIR CIRWPLPGHWSDWSPSLELRTTE(SEQ ID NO: 29) Growth hormone Chain B:  receptor(laxi)EPKFTKCRSPERETFSCHWTDEG ASN_B: 72(33)_LYS, GLN_B: 166(121)_LYS,PIQLFYTRRNEWKECPDYVSAGE GLU_B: 183(138)_LYS, ASP_B: 190(145)_LYS,NSCYFNSSFTSIAIPYCIKLTSN GLU_B: 79(34)_LYS, GLU_B: 32(1)_LYS,GGTVDEKCFSVDEIVQPDPPIAL ASP_B: 52(21)_LYS, GLU_B: 61(22)_LYS,NWTLLNVSLTGIHADIQVRWEAP ASN_B: 182(137)_LYS, ASN_B: 114(69)_LYS,RNADIQKGWMVLEYELQYKEVNE ASN_B: 218(173)_LYS, GLU_B: 91(46)_LYS,TKWKMMDPILTTSVPVYSLKVDK ASN_B: 162(117)_LYS, ASN_B: 97(52)_LYS,EYEVRVRSKQRNSGNYGEFSEVL ASN_B: 143(98)_LYS YVTLPQM (SEQ ID NO: 30)Type II Interferon receptors Immunoglobulin superfamily receptorsIL-1 receptor Chain B:  CKEREEKIILVSSANEIDVRPCPASN_B: 30(25)_LYS, ASN_B: 32(27)_LYS, LNPNEHKGTITWYKDDSKTPVSTASN_B: 102(97)_LYS, ASN_B: 135(130)_LYS, EQASRIHQHKEKLWFVPAKVEDSASP_B: 253(248)_LYS, ASP_B: 254(249)_LYS, GHYYCVVRNSSYCLRIKISAKFVASP_B: 153(148)_LYS, GLU_B: 252(247)_LYS, ENEPNLCYNAQAIFKQKLPVAGDGLU_B: 8(3)_LYS, ASP_B: 44(39)_LYS, GGLVCPYMEFFKNENNELPKLQWGLU_B: 72(67)_LYS, ASN_B: 136(131)_LYS, YKDCKPLLLDNIHFSGVKDRLIVGLU_B: 137(132)_LYS, ASN_B: 204(199)_LYS, MNVAEKHRGNYTCHASYTYLGKQASN_B: 269(264)_LYS YPITRVIEFITLEENKPTRPVIV SPANETMEVDLGSQTQLICNVTGQLSDIAYWKWNGSVIDEDDPYLG EDYYSVENPANICRRSTLITVLN ISEIESRFYKHPFTCFAKINITHGIDAAYIQLIYPVT (SE9 ID NO: 31) C-kit receptor TNF receptor familyTNF alpha receptor Chain A: (CD120) (lext) (SEQ ID NO: 153)GLU_A: 171(159)_LYS, ASN_A: 172(160)_LYS, SVCPQGKYIHPPQNNSICCTKCHKGGLN_B: 24(14)_LYS, GLN_A: 24(12)_LYS, TYLYNDCPGPGQDTDCRECESGSGLU_A: 109(97)_LYS, ASN_A: 25(13)_LYS, FTASENHLRHCLSCSKCRKEMGQGLN_A: 169(157)_LYS, ASN_A: 23(15)_LYS, VEISSCTVDRDTVCGCRICNQYRHGLU_B: 109(99)_LYS, ASN_A: 110(98)_LYS, YWSENLFQCFNCSLCLNGTVHLSGLN_B: 48(38)_LYS, GLN_A: 17(5)_LYS, CQEKQNTVCTCHAGFFLRENECVASN_A: 26(14)_LYS, 
GLN_A: 48(36)_LYS, SCSNCKKSLECTICLCLPQIENGLN_B: 17(7)_LYS Chain B:  MDSVCPQGKYIHPQNNSICCTKCHKGTYLYNDCPGPGQDTDCRECE SGSFTASENHLRHCLSCSKCRKE MGQVEISSCTVDRDTVCGCRKNQYRHYWSENLFQCFNCSLCLNGTV HLSCQEKQNTVCTCHAGFFLREN ECVSCSNCKKSLECTKLCLP(SEQ ID NO: 32) Lymphotoxin β Chain A:  receptor(lrf3)NTGLLESQLSRHDQMLSVHDIRL ASN_A: 313(1)_LYS, ASP_A: 487(175)_LYS,ADMDLRFQVLETASYNGVLIWKI ASN_A: 453(141)_LYS, GLU_A: 463(151)_LYS,RDYICRRKQEAVMGKTLSLYSQP ASP_A: 500(188)_LYS, GLU_A: 318(6)_LYS,FYTGYFGYKMCARVYLNGDGMGK GLN_A: 320(8)_LYS, ASP_A: 325(13)_LYS,GTHLSLFFVIMRGEYDALLPWPF GLU_A: 346(34)_LYS, GLU_A: 417(105)_LYS,KQKVTLMLMDQGSSRRHLGDAFK ASN_A: 481(169)_LYS, ASP_A: 503(191)_LYS,PDPNSSSFKICPTGEMNIASGCP GLN_A: 326(14)_LYS, ASP_A: 337(25)_LYS,VFVAQTYLENGTYIKDDTIFIKV ASP_A: 339(27)_LYS IVDTSDLPDP (SEQ ID NO: 33)CD40L(laly) Chain A:  GDQNPQIAAHVISEASSKTTSVLASP_A: 117(2)_LYS, GLN_A: 118(3)_LYS, QWAEKGYYTMSNNLVTLENGKQLASN_A : 119(4)_LYS, ASN_A: 151(36)_LYS, TVKRQGLYYIYAQVITCSNREASASN_A: 157(42)_LYS, GLN_A: 166(51)_LYS, SQAPPIASLCLKSPGRFERILLRGLN_A: 186(71)_LYS, GLU_A: 202(87)_LYS, AANTHSSAICPCGQQSIHLGGVFGLU_A: 230(115)_LYS, GLN_A: 121(6)_LYS, ELQPGASVFVNVIDPSQVSHGTGASN_A: 150(35)_LYS, GLU_A: 156(41)_LYS, FTSFGLLKLASN_A: 210(95)_LYS, GLN_A: 220(105)_LYS, (SEQ ID NO: 34)GLU_A: 182(67)_LYS Chemokine receptors IL-8 receptor CCR1 CXCR4TGF beta receptors TGF beta receptors 1, Chain A:  2, 3 (lvjy)IARTIVLQESIGKGRFGEVWRGK ASN_A: 344(144)_LYS, ASN_A: 456(252)_LYS,WRGEEVAVKIFSSREERSWFREA ASN_A: 270(70)_LYS, GLN_A: 324(124)_LYS,EIYQTVMLRHENILGFIAADNKD GLN_A: 448(244)_LYS, GLU_A: 227(27)_LYS,NGTWTQLWLVSDYHEHGSLFDYL ASP_A: 366(166)_LYS, ASP_A: 430(226)_LYS,NRYTVTVEGMIKLALSTASGLAH ASP_A: 435(231)_LYS, GLN_A: 498(294)_LYS,LHMEIVGTQGKPAIAHRDLKSKN GLN_A: 208(8)_LYS, ASP_A: 269(69)_LYS,ILVKKNGTCCIADLGLAVRHDSA GLU_A: 447(243)_LYS, ASN_A: 453(249)_LYS,TDTIDIRVGTKRYMAPEVLDDSI GLN_A: 494(290)_LYS MKHFESFKRADIYAMGLVFWEIARRCSIGGIHEDYQLPYYDLVPSD PSVEEMRKVVCEQKLRFNIPNRW QSCEALRVMAKIMRECWYANGAARLTALRIKKTLSQLSQQEGIKM 
(SEQ ID NO: 35) TRANSCRIPTION FACTORS p53 (2vuk)Chain A:  SVPSQKTYQGSYGFRLGFLHSGTASN_A: 210(115)_LYS, ASN_A: 288(193)_LYS, AKSVTCTYSPALNKLFCQLAKTCGLN_B: 167(73)_LYS, ASN_B: 210(116)_LYS, PVQLWVDSTPPPGTRVRAMAIYKASN_B: 288(194)_LYS, GLU_A: 287(192)_LYS, QSQHMTEVVRRCPHHERCSDSDGGLU_B: 287(193)_LYS, ASP_A: 208(113)_LYS, LAPPQHLIRVEGNLRAEYLDDRNGLU_A: 224(129)_LYS, ASP_B: 208(114)_LYS, TFRHSVVVPCEPPEVGSDCTTIHGLU_B: 224(130)_LYS, ASP_A: 148(53)_LYS, YNYMCYSSCMGGMNRRPILTIITASP_A: 186(91)_LYS, ASP_B: 148(54)_LYS, LEDSSGNLLGRDSFEVRVCACPGASN_A: 131(36)_LYS RDRRTEEENLR (SEQ ID NO: 36) Chain B: SSVPSQKTYQGSYGFRLGFLHSG TAKSVTCTYSPALNKLFCQLAKT CPVQLWVDSTPPPGTRVRAMAIYKQSQHMTEVVRRCPHHERCSDSD GLAPPQHLIRVEGNLRAEYLDDR NTFRHSVVVPCEPPEVGSDCTTIHYNYMCYSSCMGGMNRRPILTII TLEDSSGNLLGRDSFEVRVCACP GRDRRTEEENLR(SEQ ID NO: 37) NF-kappaB(2o61) Chain B:  MDGPYLQILEQPKQRGFRFRYVCASP_B: 38(2)_LYS, ASN_B: 75(39)_LYS, EGPSHGGLPGASSEKNKKSYPQVASN_B: 288(252)_LYS, GLU_B: 287(251)_LYS, KICNYVGPAKVIVQLVTNGKNIHASP_B: 188(152)_LYS, GLU_B: 286(250)_LYS, LHAHSLVGKHCEDGICTVTAGPKASP_B: 318(282)_LYS, GLU_B: 60(24)_LYS, DMVVGFANLGILHVTKKKVFETLGLU_B: 73(37)_LYS, GLN_B: 185(149)_LYS, EARMTEACIRGYNPGLLVHPDLAASP_B: 220(184)_LYS, ASP_B: 336(300)_LYS, YLQAEGGGDRQLGDREKELIRQAASP_B: 172(136)_LYS, GLU_B: 179(143)_LYS, ALQQTKEMDLSVVRLMFTAFLPDGLU_B: 192(156)_LYS STGSFTRRLEPVVSDAIYDSKAP NASNLKIVRMDRTAGCVTGGEEIYLLCDKVQKDDIQIRFYEEEENG GVWEGFGDFSPTDVHRQFAIVFK TPKYKDINITKPASVFVQLRRKSDLETSEPKPFLYYPE (SEQ ID NO: 38) Additional exemplarytranscript. 
factors can be found in Table 2 ENZYMES Misc enzymesTissue plasminogen Chain A: activator(lrtf) TTCCGLRQYASP_B: 110(102)_LYS, GLN_B: 60(47)_LYS, (SEQ ID NO: 39)GLU_B: 60(48)_LYS, ASP_B: 110(102)_LYS, Chain B: ASP_B: 204(204)_LYS, ASP_B: 97(88)_LYS, IKGGLFADIASHPWQAAIFAKHHASP_B: 127(122)_LYS, ASN_B: 186(186)_LYS, RRGGERFLCGGILISSCWILSAAGLN_B: 60(47)_LYS, GLU_B: 60(48)_LYS, HCFQQQQQEEEEERRRRRFFFFFASN_B: 173(170)_LYS, ASP_B: 240(240)_LYS, PPPPPPHHLTVILGRTYRVVPGEGLN_B: 60(47)_LYS, GLU_B: 60(48)_LYS, EEQKFEVEKYIVHKEFDDDTYDNGLU_B: 78(69)_LYS DIALLQLKSSSSSDDDDDSSSSS SSSSSRRRRRCAQESSVVRTVCLPPADLQLPDWTECELSGYGKHEA LSPFYSERLKEAHVRLYPSSRCT TTSSSQQQHLLNRTVTDNMLCAGDTTTRRRSSSNNNLHDACQGDSG GPLVCLNDGRMTLVGIISWGLGC GGQQKDVPGVYTKVTNYLDWIRDNMRP (SEQ ID NO: 40) Factor IX Chain A:  VVGGEDAKPGQFPWQVVLNGKVDASN_A: 95(80)_LYS, ASP_B: 104(19)_LYS, AFCGGSIVNEKWIVTAAHCVEETGLU_A: 60(44)_LYS, GLU_A: 204(194)_LYS, TGVKITVVAGEHNIEETEHTEQKGLU_A: 240(230)_LYS, GLU_B: 119(34)_LYS, RNVIRIIPHHNYNNNAAAAAAINASN_B: 120(35)_LYS, GLU_A: 74(59)_LYS, KYNHDIALLELDEPLVLNSYVTPGLU_A: 75(60)_LYS, ASN_A: 93(78)_LYS, ICIADKEYTTTNNNIIIFLKFGSASN_A: 97(84)_LYS, GLU_A: 127(114)_LYS, GYVSGWGRVFHKGRSALVLQYLRGLU_A: 186(175)_LYS, ASN_B: 105(20)_LYS, VPLVDRATCLRSTKFTIYNNMFCGLU_A: 60(44)_LYS AGGFFHEGGGRRDSCQGDSGGPH VTEVEGTSFLTGIISWGEECAAMMKGKYGIYTKVSRYVNWIKEKTK LT (SEQ ID NO: 41) Chain B: MTCNIKNGRCEQFCKNSADNKVV CSCTEGYRLAENQKSCEPAVPFP CGRVSVSQTSK(SEQ ID NO: 42) deoxyribonuclease I (rhDNase) Enzyme Replacementglucocerebrosidase Chain A:  EFARPCIPKSFGYSSVVCVCNATGLU_A: −1(1)_LYS, GLU_A: 72(71)_LYS, YCDSFDPPALGTFSRYESTRSGRGLN_A: 497(496)_LYS, ASP_A: 27(29)_LYS, RMELSMGPIQANHTGTGLLLTLQASN_A: 59(58)_LYS, GLN_A: 73(72)_LYS, PEQKFQKVKGFGGAMTDAAALNIGLN_A: 143(142)_LYS, GLU_A: 151(150)_LYS LALSPPAQNLLLKSYFSEEGIGYGLU_A: 222(221)_LYS, ASN_A: 270(269)_LYS, NIIRVPMASCDFSIRTYTYADTPGLN_A: 440(439)_LYS, ASP_A: 453(452)_LYS, DDFQLHNFSLPEEDTKLKIPLIHASN_A: 333(332)_LYS, ASN_A: 275(274)_LYS, RALQLAQRPVSLLASPWTSPTWLASN_A: 442(441)_LYS 
KTNGAVNGKGSLKGQPGDIYHQT WARYFVKFLDAYAEHKLQFWAVTAENEPSAGLLSGYPFQCLGFTPE HQRDFIARDLGPTLANSTHHNVR LLMLDDQRLLLPHWAKVVLTDPEAAKYVHGIAVHWYLDFLAPAKAT LGETHRLFPNTMLFASEACVGSK FWEQSVRLGSWDRGMQYSHSIITNLLYHVVGWTDWNLALNPEGGPN WVRNFVDSPIIVDITKDTFYKQP MFYHLGHFSKFIPEGSQRVGLVASQKNDLDAVALMHPDGSAVVVVL NRSSKDVPLTIKDPAVGFLETIS PGYSIHTYLWHRQ(SEQ ID NO: 43) alpha galactosidase A Chain A:  LDNGLARTPTMGWLHWERFMCNLGLU_A: 103(72)_LYS, GLN_A: 57(26)_LYS, DCQEEPDSCISEKLFMEMAELMVGLU_A: 58(27)_LYS, GLU_A: 178(147)_LYS, SEGWKDAGYEYLCIDDCWMAPQRASP_A: 101(70)_LYS, ASP_A: 175(144)_LYS, DSEGRLQADPQRFPHGIRQLANYGLN_A: 212(181)_LYS, GLN_A: 306(275)_LYS, VHSKGLKLGIYADVGNKTCAGFPGLN_A: 333(302)_LYS, ASP_A: 335(304)_LYS, GSFGYYDIDAQTFADWGVDLLKFGLU_A: 59(28)_LYS, GLN_A: 111(80)_LYS, DGCYCDSLENLADGYKHMSLALNASN_A: 215(184)_LYS, GLU_A: 251(220)_LYS, RTGRSIVYSCEWPLYMWPFQKPNGLU_A: 358(327)_LYS YTEIRQYCNHWRNFADIDDSWKS IKSILDWTSFNQERIVDVAGPGGWNDPDMLVIGNFGLSWNQQVTQM ALWAIMAAPLFMSNDLRHISPQA KALLQDKDVIAINQDPLGKQGYQLRQGDNFEVWERPLSGLAWAVAM INRQEIGGPRSYTIAVASLGKGV ACNPACFITQLLPVKRKLGFYEWTSRLRSHINPTGTVLLQLENTM (SEQ ID NO: 44) arylsulfatase-A Chain A: (iduronidase, α-L-) RPPNIVLIFADDLGYGDLGCYGHASN_A: 350(331)_LYS, GLU_A: 103(84)_LYS, PSSTTPNLDQLAAGGLRFTDFYVGLU_A: 451(428)_LYS, GLN_A: 215(196)_LYS, PVSLPSRAALLTGRLPVRMGMYPASP_A: 216(197)_LYS, GLU_A: 424(405)_LYS, GVLVPSSRGGLPLEEVTVAEVLAASP_A: 267(248)_LYS, GLU_A: 131(112)_LYS, ARGYLTGMAGKWHLGVGPEGAFLASP_A: 411(392)_LYS, GLN_A: 454(431)_LYS, PPHQGFHRFLGIPYSHDQGPCQNGLN_A: 465(442)_LYS, GLN_A: 51(33)_LYS, LTCFPPATPCDGGCDQGLVPIPLASN_A: 158(139)_LYS, ASP_A: 207(188)_LYS, LANLSVEAQPPWLPGLEARYMAFGLN_A: 371(352)_LYS AHDLMADAQRQDRPFFLYYASHH THYPQFSGQSFAERSGRGPFGDSLMELDAAVGTLMTAIGDLGLLEE TLVIFTADNGPETMRMSRGGCSG LLRCGKGTTYEGGVREPALAFWPGHIAPGVTHELASSLDLLPTLAA LAGAPLPNVTLDGFDLSPLLLGT GKSPRQSLFFYPSYPDEVRGVFAVRTGKYKAHFFTQGSAHSDTTAD PACHASSSLTAHEPPLLYDLSKD PGENYNLLGATPEVLQALKQLQLLKAQLDAAVTFGPSQVARGEDPA LQICCHPGCTPRPACCHCP (SEQ ID NO: 45)arylsulfatase B(N- Chain A:  
acetylgalactos-amine-SRPPHLVFLLADDLGWNDVGFHG GLU_A: 229(187)_LYS, ASN_A: 188(146)_LYS,4-sulfatase)(lfsu) SRIRTPHLDALAAGGVLLDNYYTGLU_A: 249(207)_LYS, GLU_A: 250(208)_LYS, QPLTPSRSQLLTGRYQIRTGLQHASN_A: 366(324)_LYS, GLN_A: 456(397)_LYS, QIIWPCQPSCVPLDEKLLPQLLKASN_A: 458(399)_LYS, ASP_A: 125(83)_LYS, EAGYTTHMVGKWHLGMYRKECLPASN_A: 225(183)_LYS, ASP_A: 256(214)_LYS, TRRGFDTYFGYLLGSEDYYSHERGLU_A: 490(431)_LYS, GLU_A: 201(159)_LYS, CTLIDALNVTRCALDFRDGEEVAASN_A: 208(166)_LYS, GLN_A: 259(217)_LYS, TGYKNMYSTNIFTKRAIALITNHASN_A: 398(356)_LYS PPEKPLFLYLALQSVHEPLQVPE EYLKPYDFIQDKNRHHYAGMVSLMDEAVGNVTAALKSSGLWNNTVF IFSTDNGGQTLAGGNNWPLRGRK WSLWEGGVRGVGFVASPLLKQKGVKNRELIHISDWLPTLVKLARGH TNGTKPLDGFDVWKTISEGSPSP RIELLHNIDPNFVDSSPCSAFNTSVHAAIRHGNWKLLTGYPGCGYW FPPPSQYNVSEIPSSDPPTKTLW LFDIDRDPEERHDLSREYPHIVTKLLSRLQFYHKHSVPVYFPAQDP RCDPKATGVWGPWM (SEQ ID NO: 46) galactosylcera-midase beta-galactosidase beta-hexosaminidase Chain A:  A (2gix)LWPWPQNFQTSDQRYVLYPNNFQ GLN_A: 528(492)_LYS, GLU_A: 151(115)_LYS,FQYDVSSAAQPGCSVLDEAFQRY ASP_A: 123(87)_LYS, GLU_A: 523(487)_LYS,RDLLFGTLEKNVLVVSVVTPGCN GLU_A: 527(491)_LYS, GLU_A: 111(75)_LYS,QLPTLESVENYTLTINDDQCLLL GLN_A: 237(201)_LYS, ASP_A: 34(12)_LYS,SETVWGALRGLETFSQLVWKSAE ASN_A: 43(21)_LYS, ASN_A: 42(20)_LYS,GTFFINKTEIEDFPRFPHRGLLL GLN_A: 106(70)_LYS, ASN_A: 295(259)_LYS,DTSRHYLPLSSILDTLDVMAYNK GLU_A: 447(411)_LYS, ASP_A: 492(456)_LYS,LNVFHWHLVDDPSFPYESFTFPE ASN_A: 518(482)_LYS LMRKGSYNPVTHIYTAQDVKEVIEYARLRGIRVLAEFDTPGHTLSW GPGIPGLLTPCYSGSEPSGTFGP VNPSLNNTYEFMSTFFLEVSSVFPDFYLHLGGDEVDFTCWKSNPEI QDFMRKKGFGEDFKQLESFYIQT LLDIVSSYGKGYVVWQEVFDNKVKIQPDTIIQVWREDIPVNYMKEL ELVTKAGFRALLSAPWYLNRISY GPDWKDFYVVEPLAFEGTPEQKALVIGGEACMWGEYVDNTNLVPRL WPRAGAVAERLWSNKLTSDLTFA YERLSHFRCELLRRGVQAQPLNVGFCEQEFEQ (SEQ ID NO: 47) Hexosaminidase A Chain A:  and B(2gjx)LWPWPQNFQTSDQRYVLYPNNFQ ASP_B: 317(245)_LYS, ASP_A: 123(87)_LYS,FQYDVSSAAQPGCSVLDEAFQRY ASP_B: 518(446)_LYS, ASP_C: 317(246)_LYS,RDLLFGTLEKNVLVVSVVTPGCN GLN_C: 475(404)_LYS, GLU_A: 
111(75)_LYS,QLPTLESVENYTLTINDDQCLLL GLN_B: 475(403)_LYS, ASP_C: 518(447)_LYS,SETVWGALRGLETFSQLVWKSAE GLU_D: 111(75)_LYS, GLN_D: 528(492)_LYS,GTFFINKTEIEDFPRFPHRGLLL ASP_A: 34(12)_LYS, GLN_A: 528(492)_LYS,DTSRHYLPLSSILDTLDVMAYNK ASN_B: 327(255)_LYS, GLN_B: 373(301)_LYS,LNVFHWHLVDDPSFPYESFTFPE ASP_B: 523(451)_LYS LMRKGSYNPVTHIYTAQDVKEVIEYARLRGIRVLAEFDTPGHTLSW GPGIPGLLTPCYSGSEPSGTFGP VNPSLNNTYEFMSTFFLEVSSVFPDFYLHLGGDEVDFTCWKSNPEI QDFMRKKGFGEDFKQLESFYIQT LLDIVSSYGKGYVVWQEVFDNKVKIQPDTIIQVWREDIPVNYMKEL ELVTKAGFRALLSAPWYLNRISY GPDWKDFYVVEPLAFEGTPEQKALVIGGEACMWGEYVDNTNLVPRL WPRAGAVAERLWSNKLTSDLTFA YERLSHFRCELLRRGVQAQPLNVGFCEQEFEQ (SEQ ID NO: 48) Chain B:  PALWPLPLSVKMTPNLLHLAPENFYISHSPNSTAGPSCTLLEEAFR RYHGYIFGTQVQQLLVSITLQSE CDAFPNISSDESYTLLVKEPVAVLKANRVWGALRGLETFSQLVYQD SYGTFTINESTIIDSPRFSHRGI LIDTSRHYLPVKIILKTLDAMAFNKFNVLHWHIVDDQSFPYQSITF PELSNKGSYSLSHVYTPNDVRMV IEYARLRGIRVLPEFDTPGHTLSWGKGQKDLLTPCYSDSFGPINPT LNTTYSFLTTFFKEISEVFPDQF IHLGGDEVEFKCWESNPKIQDFMRQKGFGTDFKKLESFYIQKVLDI IATINKGSIVWQEVFDDKAKLAP GTIVEVWKDSAYPEELSRVTASGFPVILSAPWYLDLISYGQDWRKY YKVEPLDFGGTQKQKQLFIGGEA CLWGEYVDATNLTPRLWPRASAVGERLWSSKDVRDMDDAYDRLTRH RCRMVERGIAAQPLYAGYCN (SEQ ID NO: 49) Chain C: PALWPLPLSVKMTPNLLHLAPEN FYISHSPNSTAGPSCTLLEEAFR RYHGYIFGTQVQQLLVSITLQSECDAFPNISSDESYTLLVKEPVAV LKANRVWGALRGLETFSQLVYQD SYGTFTINESTIIDSPRFSHRGILIDTSRHYLPVKIILKTLDAMAF NKFNVLHWHIVDDQSFPYQSITF PELSNKGSYSLSHVYTPNDVRMVIEYARLRGIRVLPEFDTPGHTLS WGKGQKDLLTPCYSLDSFGPINP TLNTTYSFLTTFFKEISEVFPDQFIHLGGDEVEFKCWESNPKIQDF MRQKGFGTDFKKLESFYIQKVLD IIATINKGSIVWQEVFDDKAKLAPGTIVEVWKDSAYPEELSRVTAS GFPVILSAPWYLDLISYGQDWRK YYKVEPLDFGGTQKQKQLFIGGEACLWGEYVDATNLTPRLWPRASA VGERLWSSKDVRDMDDAYDRLTR HRCRMVERGIAAQPLYAGYCN(SEQ ID NO: 50) Chain D:  LWPWPQNFQTSDQRYVLYPNNFQFQYDVSSAAQPGCSVLDEAFQRY RDLLFGTLEKNVLVVSVVTPGCN QLPTLESVENYTLTINDDQCLLLSETVWGALRGLETFSQLVWKSAE GTFFINKTEIEDFPRFPHRGLLL DTSRHYLPLSSILDTLDVMAYNKLNVFHWHLVDDPSFPYESFTFPE LMRKGSYNPVTHIYTAQDVKEVI EYARLRGIRVLAEFDTPGHTLSWGPGIPGLLTPCYSGSEPSGTFGP VNPSLNNTYEFMSTFFLEVSSVF 
PDFYLHLGGDEVDFTCWKSNPEIQDFMFGEDFKQLESFYIQTLLDI VSSYGKGYVVWQEVFDNKVKIQP DTIIQVWREDIPVNYMKELELVTKAGFRALLSAPWYLNRISYGPDW KDFYVVEPLAFEGTPEQKALVIG GEACMWGEYVDNTNLVPRLWPRAGAVAERLWSNKLTSDLTFAYERL SHFRCELLRRGVQAQPLNVGFCE QEFEQ (SEQ ID NO: 51)SMPD1 gene product NPC1 and NPC2 (transmembrane proteins) ASAH1 (N-acylsphingosine amidohydrolase (acid ceramidase) 1) alpha-glucosidasephenylalanine Chain A:  hydroxylase (PAH) VPWFPRTIQELDRFANQILSYGAASP_A: 338(221)_LYS, GLU_A: 360(243)_LYS, (1j8u) ELDADHPGFKDPVYRARRKQFADASN_A: 376(259)_LYS, GUT_A: 381(264)_LYS, IAYNYRHGQPIPRVEYMEEEKKTGLN_A: 172(55)_LYS, GLU_A: 316(199)_LYS, WGTVFKTLKSLYKTHACYEYNHIASN_A: 133(16)_LY S, ASP_A: 151(34)_LYS, FPLLEKYCGFHEDNIPQLEDVSQASN_A: 167(50)_LYS, GLU_A: 178(61)_LYS, FLQTCTGFRLRPVAGLLSSRDFLASP_A: 145(28)_LYS, GLU_A: 181(64)_LYS, GGLAFRVFHCTQYIRHGSKPMYTGLN_A: 134(17)_LYS, ASP_A: 143(26)_LYS, PEPDICHELLGHVPLFSDRSFAQGLU_A: 182(65)_LYS FSQEIGLASLGAPDEYIEKLATI YWFTVEFGLCKQGDSIKAYGAGLLSSFGELQYCLSEKPKLLPLELE KTAIQNYTVTEFQPLYYVAESFN DAKEKVRNFAATIPRPFSVRYDPYTQRIEVL (SEQ ID NO: 52) Cathepsin A Chain A:  APDQDEIQRLPGLAKQPSFRQYSGLN_A: 215(215)_LYS, ASN_A: 216(216)_LYS, GYLKSSGSKHLHYWFVESQKDPEGLN_A: 327(327)_LYS, ASP_A: 404(404)_LYS, NSPVVLWLNGGPGCSSLDGLLTEASP_A: 3(3)_LYS, ASP_A: 111(111)_LYS, HGPFLVQPDGVTLEYNPYSWNLIGLN_A: 394(394)_LYS, GLN_A: 450(450)_LYS, ANVLYLESPAGVGFSYSDDKFYAASP_A: 110(110)_LYS, GLN_A: 165(165)_LYS, TNDTEVAQSNFEALQDFFRLFPEASP_A: 266(266)_LYS, GLN_A: 288(288)_LYS, YKNNKLFLTGESYAGIYIPTLAVGLU_A: 326(326)_LYS, ASN_A: 388(388)_LYS, LVMQDPSMNLQGLAVGNGLSSYEASN_A: 448(448)_LYS QNDNSLVYFAYYHGLLGNRLWSS LQTHCCSQNKCNFYDNKDLECVTNLQEVARIVGNSGLNIYNLYAPC AGGVPSHFRYEKDTVVVQDLGNI FTRLPLKRMWHQALLRSGDKVRMDPPCTNTTAASTYLNNPYVRKAL NIPEQLPQWDMCNFLVNLQYRRL YRSMNSQYLKLLSSQKYQILLYNGDVDMACNFMGDEWFVDSLNQKM EVQRRPWLVKYGDSGEQIAGFVK EFSHIAFLTIKGAGHMVPTDKPLAAFTMFSRFLNKQPY (SEQ ID NO: 53) STRUCTURAL PROTEINS Collagen ElastinActin (1lot) Chain B:  DETTALVCDNGSGLVKAGFAGDDASP_B: 3(1)_LYS, GLU_B: 4(2)_LYS, 
APRAVFPSIVGRPRDSYVGDEAQASP_B: 244(230)_LYS, ASP_B: 51(38)_LYS, SKRGILTLKYPIEGIITNWDDMEASP_B: 288(274)_LYS, GLN_B: 246(232)_LYS, KIWHHTFYNELRVAPEEHPTLLTGLU_B: 167(153)_LYS, ASP_B: 286(272)_LYS, EAPLNPKANREKMTQIMFETFNVGLN_B: 354(340)_LYS, ASP_B: 80(66)_LYS, PAMYVAIQAVLSLYASGRTTGIVASP_B: 222(208)_LYS, GLU_B: 224(210)_LYS, LDSGDGVTHNVPIYEGYALPHAIGLU_B: 270(256)_LYS, GLU_B: 364(350)_LYS, MRLDLAGRDLTDYLMKILTERGYGLU_B: 195(181)_LYS SFVTTAEREIVRDIKEKLCYVAL DFENEMATAASSSSLEKSYELPDGQVITIGNERFRCPETLFQPSFI GMESAGIHETTYNSIMKCDIDIR KDLYANNVMSGGTTMYPGIADRMQKEITALAPSTMKIKIIAPPERK YSVWIGGSILASLSTFQQMWITK QEYDEAGPSIVHRK(SEQ ID NO: 54) Tubilin (3cb2) Chain A:  PREIITLQLGQCGNQIGFEFWKQASP_A: 310(303)_LYS, GLU_A: 43(42)_LYS, LCAEHGISPEAIVEEFATEGTDRASP_A: 56(55)_LYS, ASP_A: 57(56)_LYS, KDVFFYQADDEHYIPRAVLLDLEGLU_A: 39(38)_LYS, GLU_A: 177(176)_LYS, PRVIHSILNSPYAKLYNPENIYLASP_A: 180(179)_LYS, GLU_B: 95(93)_LYS, SEHGGGAGNNWASGFSQGEKIHEASP_ B: 57(55)_LYS, ASP_B: 130(126)_LYS, DIFDIIDREADGSDSLEGFVLCHASP_B: 176(172)_LYS, ASN_A: 79(78)_LYS, SIAGGTGSGLGSYLLERLNDRYPASP_A: 127(126)_LYS, ASP_A: 130(129)_LYS, KKLVQTYSVFPNQDEMSDVVVQPASP_A: 216(215)_LYS YNSLLTLKRLTQNADCLVVLDNT ALNRIATDRLHIQNPSFSQINQLVSTIMSASTTTLRYPGYMNNDLI GLIASLIPTPRLHFLMTGYTPLT SVRKTTVLDVMRRLLQPKNVMVSTGRDTNHCYIAILNIIQGEVDPT QVHKSLQRIRERKLANFIPWGPA SIQVALSRKSPYRVSGLMMANHTSISSLFERTCRQYDKLRKREAFL EQFRKEDMFKDNFDEMDTSREIV QQLIDEYHAATRPDYISW(SEQ ID NO: 55) Chain B:  REIITLQLGQCGNQIGFEFWKQLCAEHGISPEAIVEEFATEGTDRK DVFFYQADDEHYIPRAVLLDLEP RVIHSILNSPYAKLYNPENIYLSEHGAGNNWASGFSQGEKIHEDIF DIIDREADGSDSLEGFVLCHSIA GGTGSGLGSYLLERLNDRYPKKLVQTYSVFPNQDEMSDVVVQPYNS LLTLKRLTQNADCLVVLDNTALN RIATDRLHIQNPSFSQINQLVSTIMSASTTTLRYPGYMNNDLIGLI ASLIPTPRLHFLMTGYTPLTKTT VLDVMRRLLQPKNVMVSTTNHCYIAILNIIQGEVDPTQVHKSLQRI RERLANFIPWGPASIQVALSRKS PYLPRVSGLMMANHTSISSLFERTCRQYDKLRKREAFLEQFRKEDM FKDNFDEMDTSREIVQQLIDEYH AATRPDYISW(SEQ ID NO: 56) Keratin Myosin (2fxo) Chain A:  GSSPLLKSAEREKEMASMKEEFTGLU_A: 844(10)_LYS, GLU_A: 854(20)_LYS, RLKEALEKSEARRKELEEKMVSLGLU_B: 
854(18)_LYS, GLN_B: 882(46)_LYS, LQEKNDLQLQVQAEQDNLADAEEASP_B: 956(120)_LYS, GLN_D: 882(46)_LYS, RCDQLIKNKIQLEAKVKEMNKRLGLU_A: 848(14)_LYS, GLU_A: 875(41)_LYS, EDEEEMNAELTAKKRKLEDECSEGLN_A: 882(48)_LYS, GLN_A: 914(80)_LYS, LKRDIDDLELTLAKGLU_A: 921(87)_LYS, ASP_A: 956(122)_LYS, (SEQ ID NO: 57)GLU_B: 848(12)_LYS, GLU_B: 864(28)_LYS, Chain B:  GLU_B: 875(39)_LYSSPLLKSAEREKEMASMKEEFTRL KEALEKSEARRKELEEKMVSLLQ EKNDLQLQVQAEQDNLADAEERCDQLIKNKIQLEAKVKEMNKRLED EEEMNAELTAKKRKLEDECSELK RDIDDLELTL(SEQ ID NO: 58) Chain C:  SSPLLKSAEREKEMASMKEEFTRLKEALEKSEARRKELEEKMVSLL QEKNDLQLQVQAEQDNLADAEER CDQLIKNKIQLEAKVKEMNKRLEDEEEMNAELTAKKRKLEDECSEL KRDIDDLELTLA (SEQ ID NO: 59) Chain D: SPLLKSAEREKEMASMKEEFTRL KEALEKSEARRKELEEKMVSLLQ EKNDLQLQVQAEQDNLADAEERCDQLIKNKIQLEAKVKEMNKRLED EEEMNAELTAKKRKLEDECSELK RDIDDLELTLAK(SEQ ID NO: 60) EXTRACELLUL. PROTEINS Cytokines Colony StimulatingFactors G-CSF Chain A:  LPQSFLLKCLEQVRKIQGDGAALGLU_A: 123(106)_LYS, GLU_A: 122(105)_LYS, QEKLCATYKLCHPEELVLLGHSLGLN_A: 11(3)_LYS, GLU_A: 45(37)_LYS, GIPWAPLLAGCLSQLHSGLFLYQGLU_A: 46(38)_LYS, GLU_A: 98(81)_LYS, GLLQALEGISPELGPTLDTLQLDGLU_A: 19(11)_LYS, GLN_A: 119(102)_LYS, VADFATTIWQQMEELGMMPAFASASP_A: 112(95)_LYS, GLN_A: 77(60)_LYS, AFQRRAGGVLVASHLQSFLEVSYGLU_A: 33(25)_LYS, GLN_A: 90(73)_LYS, RVLRHLAGLU_A: 93(76)_LYS, ASP_A: 104(87)_LYS. 
(SEQ ID NO: 61)GLU_A: 162(135)_LYS GM-CSF Chain B:  EHVNAIQEARRLLNLSRDTAAEMGLN_B: 50(37)_LYS, GLU_B: 14(1)_LYS, NETVEVISEMFDLQEPTCLQTRLGLU_B: 51(38)_LYS, GLN_B: 86(73)_LYS, ELYKQGLRGSLTKLKGPLTMMASASN_B: 27(14)_LYS, ASP_B: 48(35)_LYS, HYKQHCPPTPETSCATQIITFESASN_B: 17(4)_LYS, ASP_B: 31(18)_LYS, FKENLKDFLLVIPGLU_B: 93(80)_LYS, GLN_B: 99(86)_LYS, (SEQ ID NO: 62)GLU_B: 21(8)_LYS, ASN_B: 37(24)_LYS,GLU_B: 45(32)_LYS, GLN_B: 64(51)_LYS, GLU_B: 108(95)_LYS InterferonsInterferon alfa-2 Chain B:  CDLPQTHSLGSRRTLMLLAQMRKLU_B: 165(165)_LYS, GLN_B: 5(5)_LYS, ISLFSCLKDRHDFGFPQEEFGNQGLU_B: 107(107)_LYS, GLN_B: 46(46)_LYS, FQKAETIPVLHEMIQQIFNLFSTGLN_B: 101(101)_LYS, ASN_B: 45(45)_LYS, KDSSAAWDETLLDKFYTELYQQLASN_B: 65(65)_LYS, GLU_B: 132(132)_LYS, NDLEACVIQGVGVTETPLMKEDSGLU_B: 159(159)_LYS, GLU_B: 41(41)_LYS, ILAVRKYFQRITLYLKEKKYSPCASP_B: 82(82)_LYS, ASP_B: 2(2)_LYS, AWEVVRAEIMRSFSLSTNLQESLGLN_B: 20(20)_LYS, ASP_B: 35(35)_LYS, RSKE ASP_B: 71(71)_LYS(SEQ ID NO: 63) Interferon beta-1 Chain A:  MSYNLLGFLQRSSNFQCQKLLWQASP_A: 110(110)_LYS, GLU_A: 29(29)_LYS, LNGRLEYCLKDRMNFDIPEEIKQASN_A: 37(37)_LYS, GLU_A: 42(42)_LYS, LQQFQKEDAALTIYEMLQNIFAIGLU_A: 109(109)_LYS, GLN_A: 46(46)_LYS, FRQDSSSTGWNETIVENLLANVYGLN_A: 48(48)_LYS, GLN_A: 49(49)_LYS, HQINHLKTVLEEKLEKEDFTRGKGLU_A: 103(103)_LYS, GLU_A: 107(107)_LYS, LMSSLHLKRYYGRILHYLKAKEYASP_A: 39(39)_LYS, GLN_A: 51(51)_LYS, SHCAWTIVRVEILRNFYFINRLTGLU_A: 104(104)_LYS, ASN_A: 166(166)_LYS, GYLRN GLN_A: 23(23)_LYS(SEQ ID NO: 64) Interferon gamma-1b Chain A:  MQDPYVKEAENLKKYFNAGHSDVASN_A: 225(143)_LYS, ASP_A: 224(142)_LYS, ADNGTLFLGILKNWKEESDRKIMGLN_A: 1(2)_LYS, ASP_A: 2(3)_LYS, QSQIVSFYFKLFKNFKDDQSIQKGLN_A: 64(65)_LYS, GLU_A: 238(156)_LYS, SVETIKEDMNVKFFNSNKKKRDDGLN_A: 264(182)_LYS, ASP_A: 24(25)_LYS, FEKLTNYSVTDLNVQRKAIDELIASN_A: 25(26)_LYS, ASP_A: 102(103)_LYS, QVMAELGANVSGEFVKEAENLKKASN_A: 297(215)_LYS, ASP_A: 302(220)_LYS, YFNDNGTLFLGILKNWKEESDRKGLU_A: 38(39)_LYS, ASN_A: 59(60)_LYS, IMQSQIVSFYFKLFKNFKDDQSIASP_A: 63(64)_LYS QKSVETIKEDMNVKFFNSNKKKR 
DDFEKLTNYSVTDLNVQRKAIHELIQVMAELSPAA (SEQ ID NO: 65) Interleukins IL-2 (1M47) Chain A: STKKTQLQLEHLLLDLQMILNGI ASN_A: 77(70)_LYS, ASN_A: 33(28)_LYS,NNYKNPKLTRMLTFKFYMPKKAT ASP_A: 109(98)_LYS, GLN_A: 74(69)_LYS,ELKHLQCLEEELKPLEEVLNLAQ ASP_A: 84(77)_LYS, GLU_A: 95(88)_LYS,NFHLRPRDLISNINVIVLELKGF GLU_A: 110(99)_LYS, ASN_A: 26(21)_LYS,MCEYADETATIVEFLNRWITFCQ ASN_A: 29(24)_LYS, ASN_A: 30(25)_LYS, SIISTLTGLU_A: 52(47)_LYS, GLU_A: 68(63)_LYS, (SEQ ID NO: 66)ASN_A: 71(66)_LYS, GLU_A: 61(56)_LYS, GLU_A: 62(57)_LYS IL-1 receptorChain A:  antagonist (1irb) ALWQFNGMIKCKIPSSEPLLDFNASN_A: 79(79)_LYS, GLU_A: 114(114)_LYS, NYGCYCGLGGSGTPVDDLDRCCQASP_A: 59(59)_LYS, GLU_A: 87(87)_LYS, THDNCYKQAKKLDSCKVLVDNPYASP_A: 21(21)_LYS, ASN_A: 50(50)_LYS, TNNYSYSCSNNEITCSSENNACEASP_A: 66(66)_LYS, GLU_A: 81(81)_LYS, AFICNCDRNAAICFSKVPYNKEHASP_A: 119(119)_LYS, ASN_A: 122(122)_LYS, KNLDAANCASN_A: 80(80)_LYS, ASN_A: 89(89)_LYS, (SEQ ID NO: 67)ASN_A: 112(112)_LYS, GLU_A: 17(17)_LYS, GLN_A: 54(54)_LYS IL-1 (2nvh)Chain A:  APVRSLNCTLRDSQQKSLVMSGP GLN_A: 34(34)_LYS, ASN_A: 53(53)_LYS,YELKALHLQGQDMEQQVVFSMSF ASP_A: 75(75)_LYS, ASP_A: 76(76)_LYS,VQGEESNDKIPVALGLKEKNLYL ASN_A: 107(107)_LYS, ASN_A: 89(89)_LYS,SCVLKDDKPTLQLESVDPKNYPK ASN_A: 108(108)_LYS, ASP_A: 35(35)_LYS,KKMEKRFVFNKIEINNKLEFESA ASP_A: 86(86)_LYS, GLU_A: 50(50)_LYS,QFPNWYISTSQAENMPVFLGGTK GLN_A: 141(141)_LYS, GLN_A: 32(32)_LYS,GGQDITDFTMQFVS GLU_A: 37(37)_LYS, ASP_A: 54(54)_LYS, (SEQ ID NO: 68)GLU_A: 64(64)_LYS Ciliary neurotrophic Chain 1:  factor (CNTF) (lent)PHRRDLCSRSIWLARKIRSDLTA GLU_4: 66(34)_LYS, GLU_1: 66(37)_LYS,LTESYVKHQGLWSELTEAERLQE GLU_1: 153(116)_LYS, ASN_4: 137(99)_LYS,NLQAYRTFHVLLARLLEDQQVHF ASP_1: 104(75)_LYS, GLU_1: 131(102)_LYS,TPTEGDFHQAIHTLLLQVAAFAY GLU_1: 138(109)_LYS, GLU_4: 71(39)_LYS,QIEELMILLEYKIPRNEADGMLF ASP_1: 140(111)_LYS, GLU_1: 164(127)_LYS,EKKLWGLKVLQELSQWTVRSIHD GLN_1: 167(130)_LYS, GLU_4: 131(93)_LYS,LRFISSHQTGIP ASP_1: 15(5)_LYS, GLU_1: 36(26)_LYS, (SEQ ID NO: 69)ASN_1: 137(108)_LYS Chain 4:  
HRRDLCSRSIWLARKIRSDLTALTESYVKHQGLELTEAERLQENLQ AYRTFHVLLARLLEDQQEGDFHQ AIHTLLLQVAAFAYQIEELMILLEYKIPRNKKLWGLKVLQELSQWT VRSIHDLRFIS (SEQ ID NO: 70) TNFsTNF-alpha (4tsv) Chain A:  DKPVAHVVANPQAEGQLQWSNRRASP_A: 10(1)_LYS, GLU_A: 107(98)_LYS, ANALLANGVELRDNQLVVPIEGLGLN_A: 21(12)_LYS, GLN_A: 102(93)_LYS, FLIYSQVLFKGQGCPSTHVLLTHGLU_A: 146(137)_LYS, ASN_A: 34(25)_LYS, TISRIAVSYQTKVNLLSAIKSPCGLU_A: 23(14)_LYS, ASP_A: 45(36)_LYS, QRETPEGAEAKPWYEPIYLGGVFGLN_A: 88(79)_LYS, GLN_A: 125(116)_LYS, QLEKGDRLSAEINRPDYLDFAESASN_A: 39(30)_LYS, GLN_A: 67(58)_LYS, GQVYFGIIALGLU_A: 110(101)_LYS, GLU_A: 53(44)_LYS, (SEQ ID NO: 71)ASN_A: 92(83)_LYS TNF-beta Chain A:  (lymphotoxin) (1tnr)KPAAHLIGDPSKQNSLLWRANTD GLN_A: 107(80)_LYS, ASP_A: 50(23)_LYS,RAFLQDGFSLSNNSLLVPTSGIY ASN_A: 62(35)_LYS, GLU_A: 127(100)_LYS,FVYSQVVFSGKAYSPKATSSPLY GLN_A: 140(113)_LYS, ASN_A: 41(14)_LYS,LAHEVQLFSSQYPFHVPLLSSQK ASP_A: 56(29)_LYS, ASN_A: 48(21)_LYS,MVYPGLQEPWLHSMYHGAAFQLT GLN_A: 55(28)_LYS, GLN_A: 118(91)_LYS,QGDQLSTHTDGIPHLVLSPSTVF GLN_A: 40(13)_LYS, GLN_A: 143(116)_LYS, FGAFALGLN_A: 126(99)_LYS, ASP_A: 152(125)_LYS, (SEQ ID NO: 72)ASN_A: 63(36)_LYS Peptide Hormones Erythropoietin Chain A: ASP_A: 165(165)_LYS, GLU_A: 89(89)_LYS, APPRLICDSRVLERYLLEAKEAEGLU_A: 31(31)_LYS, ASP_A: 123(123)_LYS, KITTGCAEHCSLNEKITVPDTKVASN_A: 47(47)_LYS, GLU_A: 55(55)_LYS, NFYAWKRMEVGQQAVEVWQGLALGLN_A: 86(86)_LYS, ASN_A: 36(36)_LYS, LSEAVLRGQALLVKSSQPWEPLQGLU_A: 37(37)_LYS, GLU_A: 159(159)_LYS, LHVDKAVSGLRSLTTLLRALGAQASP_A: 8(8)_LYS, GLN_A: 92(92)_LYS, KEAISNSDAASAAPLRTITADTFASP_A: 96(96)_LYS, GLU_A: 13(13)_LYS, RKLFRVYSNFLRGKLKLYTGEACGLU_A: 21(21)_LYS RTGDR (SEQ ID NO: 73) Insulin Chain A: GIVEQCCTSICSLYQLENYCN ASN_B: 3(3)_LYS, GLU_B: 13(13)_LYS,(SEQ ID NO: 74) GLU_B: 21(21)_LYS, GLU_A: 4(4)_LYS, Chain B: GLN_A: 5(5)_LYS, ASN_A: 21(21)_LYS, FVNQHLCGSHLVEALYLVCGERGGLN_A: 15(15)_LYS, ASN_A: 18(18)_LYS, FFYTPKGLN_B: 4(4)_LYS, GLU_A: 17(17)_LYS (SEQ ID NO: 75) Growth hormoneChain A:  (GH)(Somatotropin) FPTIPLSRLADNAWLRADRLNQLGLU_A: 
129(129)_LYS, GLU_A: 39(39)_LYS, (lhuw) AFDTYQEFEEAYIPKEQIHSFWWASN_A: 47(47)_LYS, ASN_A: 63(63)_LYS, NPQTSLCPSESIPTPSNKEETQQGLU_A: 65(65)_LYS, GLU_A: 66(66)_LYS, KSNLELLRISLLLIQSWLEPVQFGLU_A: 88(88)_LYS, GLN_A: 40(40)_LYS, LRSVFANSLVYGASDSNVYDLLKGLN_A: 69(69)_LYS, ASP_A: 107(107)_LYS, DLEEGIQTLMGRLEALLKNYGLLASP_A: 112(112)_LYS, GLU_A: 33(33)_LYS, YCFNKDMSKVSTYLRTVQCRSVEGLN_A: 91(91)_LYS, ASN_A: 99(99)_LYS, GSCGF ASP_A: 116(116)_LYS(SEQ ID NO: 76) Follicle-stimulating Chain C:  hormone (FSH)CHHRICHCSNRVFLCQESKVTEI ASP_C: 43(26)_LYS, ASN_C: 27(10)_LYS,PSDLPRNAIELRFVLTKLRVIQK ASN_C: 47(30)_LYS, ASN_C: 112(95)_LYS,GAFSGFGDLEKIEISQNDVLEVI ASN_C: 251(234)_LYS, GLU_C: 259(242)_LYS,EADVFSNLPKLHEIRIEKANNLL GLU_C: 34(17)_LYS, GLU_C: 239(222)_LYS,YINPEAFQNLPNLQYLLISNTGI ASN_C: 240(223)_LYS, GLU_C: 39(22)_LYS,KHLPDVHKIHSLQKVLLDIQDNI ASP_C: 71(54)_LYS, ASN_C: 205(188)_LYS,NIHTIERNSFVGLSFESVILWLN GLU_C: 207(190)_LYS, ASN_C: 211(194)_LYS,KNGIQEIHNCAFNGTQLDELNLS GLU_C: 76(59)_LYS DNNNLEELPNDVFHGASGPVILDISRTRIHSLPSYGLENLKKLRAR STYNLKKLPTLE (SEQ ID NO: 77) Gonadotropin-releasing hormone (GnRH) Thyrotropin-releasing hormone(TRH)somatostatin(growth- hormone-inhibiting hormone Leptin(1ax8) Chain A: IQKVQDDTKTLIKTIVTRINDIL GLN_A: 4(2)_LYS, ASP_A: 23(21)_LYS,DFIPGLHPILTLSKMDQTLAVYQ ASP_A: 40(24)_LYS, GLU_A: 105(89)_LYS,QILTSMPSRNVIQISNDLENLRD ASP_A: 108(92)_LYS, GLU_A: 100(84)_LYS,LLHVLAFSKSCHLPEASGLETLD ASP_A: 8(6)_LYS, ASN_A: 22(20)_LYS,SLGGVLEASGYSTEVVALSRLQG ASP_A: 141(125)_LYS, ASN_A: 78(62)_LYS,SLQDMLWQLDLSPGC ASP_A: 9(7)_LYS, GLN_A: 73(59)_LYS, (SEQ ID NO: 78)ASP_A: 85(69)_LYS, ASN_A: 72(56)_LYS, GLU_A: 81(65)_LYS Growth-hormone-releasing hormone (GHRH) Chain 1:  Insulin-like growthPETLCGAELVDALQFVCGDRGFY GLU_I: 3(2)_LYS, ASP_I: 20(19)_LYS, factor(orFNKPTGYGSSSRRAPQTGIVDEC GLU_I: 9(8)_LYS, ASP_I: 12(11)_LYS,somatomedin)(1wqi) CFRSCDLRRLEMYCAPASN_I: 26(25)_LYS, GLN_I: 40(39)_LYS, (SEQ ID NO: 79)ASP_I: 153(52)_LYS, ASP_I: 45(44)_LYS,GLU_I: 58(57)_LYS, GLN_I: 15(14)_LYS, GLU_I: 
46(45)_LYS Antimullerianhormone (or mullerian inhibiting factor or hormone) Adiponectin (1c28)Chain A:  MYRSAFSVGLETRVTVPNVPIRFASP_C: 173(55)_LYS, GLN_B: 191(72)_LYS, TKIFYNQQNHYDGSTGKFYCNIPGLU_A: 194(82)_LYS, ASP_A: 182(70)_LYS, GLYYFSYHITVYMKDVKVSLFKKGLN_B: 193(74)_LYS, GLN_A: 143(31)_LYS, DKAVLFTYDQYQENVDQASGSVLASN_B: 130(12)_LYS, GLN_B: 143(25)_LYS, LHLEVGDQVWLQVYYADNVNDSTASP_B: 182(64)_LYS, ASP_B: 190(71)_LYS, FTGFLLYHDTGLN_C: 143(28)_LYS, ASP_C: 182(64)_LYS, (SEQ ID NO: 80)ASP_B: 173(55)_LYS, ASP_B: 245(111)_LYS, Chain B:  ASN_A: I44(32)_LYSMYRSAFSVGLPNVPIRFTKIFYN QQNHYDGSTGKFYCNIPGLYYFS YHITVYMKDVKVSLFKKDKVLFTYDQYQEKVDQASGSVLLHLEVGD QVWLQVYDSTFTGFLLYHD (SEQ ID NO: 81) Chain C: MYRSAFSVGLETRVTVPIRFTKI FYNQQNHYDGSTGKFYCNIPGLY YFSYHITVDVKVSLFKKDKAVLFTQASGSVLLHLEVGDQVWLQNDS TFTGFLLYHD (SEQ ID NO: 82) Adrenocorticotropichormone (or corticotropin) Angiotensinogen and angiotensinAntidiuretic hormone (or vasopressin, arginine vasopressin)Atrial-natriuretic peptide (or atriopeptin) B-type natriureticpeptide (BNP) Calcitonin Cholecystokinin Corticotropin-releasing hormone Gastrin Luteinizing hormone (LH) Coagulation FactorsChain A:  Factor VIII (aka ATRRYYLGAVELSWDYMQSDLGEGLN_A: 334(327)_LYS, ASN_A: 214(214)_LYS, antihemophilicLPVDARFPPRVPKSFPFNTSVVY ASP_A: 361(329)_LYS, ASP_A: 27(27)_LYS,factor) (2r7e) KKTLFVEFTDHLFNIAKPRPPWMGLU_A: 211(211)_LYS, GLU_A: 331(324)_LYS, GLLGPTIQAEVYDTVVITLKNMAGLU_A: 332(325)_LYS, ASP_A: 363(331)_LYS, SHPVSLHAVGVSYWKASEGAEYDASN_A: 714(682)_LYS, ASN_A: 41(41)_LYS, DQTSQREKEDDKVFPGGSHTYVWASP_A: 362(330)_LYS, ASN_A: 364(332)_LYS, QVLKENGPMASDPLCLTYSYLSHGLU_A: 720(688)_LYS, GLN_B: 1692(4)_LYS, VDLVKDLNSGLIGALLVCREGSLASP_A: 403(371)_LYS AKEKTQTLHKFILLFAVFDEGKS WHSETKNAASARAWPKMHTVNGYVNRSLPGLIGCHRKSVYWHVIGM GTTPEVHSIFLEGHTFLVRNHRQ ASLEISPITFLTAQTLLMDLGQFLLFCHISSHQHDGMEAYVKVDSC PEEPQFDDDNSPSFIQIRSVAKK HPKTWVHYIAAEEEDWDYAPLVLAPDDRSYKSQYLNNGPQRIGRKY KKVRFMAYTDETFKTREAIQHES GILGPLLYGEVGDTLLIIFKNQASRPYNIYPHGITDVRPLYSRRLP KGVKHLKDFPILPGEIFKYKWTV 
TVEDGPTKSDPRCLTRYYSSFVNMERDLASGLIGPLLICYKESVDQ RGNQIMSDKRNVILFSVFDENRS WYLTENIQRFLPNPAGVQLEDPEFQASNIMHSINGYVFDSLQLSVC LHEVAYWYILSIGAQTDFLSVFF SGYTFKHKMVYEDTLTLFPFSGETVFMSMENPGLWILGCHNSDFRN RGMTALLKVSSCDKNTGDYYEDS YED (SEQ ID NO: 83)Chain B:  RSFQKKTRHYFIAAVERLWDYGM SSSPHVLRNRAQSGSVPQFKKVVFQEFTDGSFTQPLYRGELNEHLG LLGPYIRAEVEDNIMVTFRNQAS RPYSFYSSLISYEEDQRQGAEPRKNFVKPNETKTYFWKVQHHMAPT KDEFDCKAWAYSSDVDLEKDVHS GLIGPLLVCHTNTLNPAHGRQVTVQEFALFFTIFDETKSWYFTENM ERNCRAPCNIQMEDPTFKENYRF HAINGYIMDTLPGLVMAQDQRIRWYLLSMGSNENIHSIHFSGHVFT VRKKEEYKMALYNLYPGVFETVE MLPSKAGIWRVECLIGEHLHAGMSTLFLVYSNKCQTPLGMASGHIR DFQITASGQYGQWAPKLARLHYS GSINAWSTKEPFSWIKVDLLAPMIIHGIKTQGARQKFSSLYISQFI IMYSLDGKKWQTYRGNSTGTLMV FFGNVDSSGIKHNIFNPPIIARYIRLHPTHYSIRSTLRMELMGCDL NSCSMPLGMESKAISDAQITASS YFTNMFATWSPSKARLHLQGRSNAWRPQVNNPKEWLQVDFQKTMKV TGVTTQGVKSLLTSMYVKEFLIS SSQDGHQWTLFFQNGKVKVFQGNQDSFTPVVNSLDPPLLTRYLRIH PQSWVHQIALRMEVLGCEAQDLY (SEQ ID NO: 84) OtherChain A:  Human serum SEVAHRFKDLGEENFKALVLIAFASP_B: 301(297)_LYS, ASP_A: 301(297)_LYS, albumin (1ao6)AQYLQQCPFEDHVKLVNEVTEFA GLU_A: 505(501)_LYS, GLU_B: 505(501)_LYS,KTCVADESAENCDKSLHTLFGDK GLU_A: 82(78)_LYS, GLU_A: 542(538)_LYS,LCTVATLRETYGEMADCCAKQEP GLU_B: 82(78)_LYS, GLU_B: 542(538)_LYS,ERNECFLQHKDDNPNLPRLVRPE GLU_A: 17(13)_LYS, GLU_A: 37(33)_LYS,VDVMCTAFHDNEETFLKKYLYEI ASP_A: 562(558)_LYS, GLU_B: 17(13)_LYS,ARRHPYFYAPELLFFAKRYKAAF GLU_B: 37(33)_LYS, ASP_B: 375(371)_LYS,TECCQAADKAACLLPKLDELRDE ASP_B: 562(558)_LYS GKASSAKQRLKCASLQKFGERAFKAWAVARLSQRFPKAEFAEVSKL VTDLTKVHTECCHGDLLECADDR ADLAKYICENQDSISSKLKECCEKPLLEKSHCIAEVENDEMPADLP SLAADFVESKDVCKNYAEAKDVF LGMFLYEYARRHPDYSVVLLLRLAKTYETTLEKCCAAADPHECYAK VFDEFKPLVEEPQNLIKQNCELF EQLGEYKFQNALLVRYTKKVPQVSTPTLVEVSRNLGKVGSKCCKHP EAKRMPCAEDYLSVVLNQLCVLH EKTPVSDRVTKCCTESLVNRRPCFSALEVDETYVPKEFNAETFTFH ADICTLSEKERQIKKQTALVELV KHKPKATKEQLKAVMDDFAAFVEKCCKADDKETCFAEEGKKLVAAS QAA (SEQ ID NO: 85) Chain B: SEVAHRFKDLGEENFKALVLIAF AQYLQQCPFEDHVKLVNEVTEFA KTCVADESAENCDKSLHTLFGDKLCTVATLRETYGEMADCCAKQEP ERNECFLQHKDDNPNLPRLVRPE 
VDVMCTAFHDNEETFLKKYLYEIARRHPYFYAPELLFFAKRYKAAF TECCQAADKAACLLPKLDELRDE GKASSAKQRLKCASLQKFGERAFKAWAVARLSQRFPKAEFAEVSKL VTDLTKVHTECCHGDLLECADDR ADLAKYICENQDSISSKLKECCEKPLLEKSHCIAEVENDEMPADLP SLAADFVESKDVCKNYAEAKDVF LGMFLYEYARRHPDYSVVLLLRLAKTYETTLEKCCAAADPHECYAK VFDEFKPLVEEPQNLIKQNCELF EQLGEYKFQNALLVRYTKKVPQVSTPTLVEVSRNLGKVGSKCCKHP EAKRMPCAEDYLSVVLNQLCVLH EKTPVSDRVTKCCTESLVNRRPCFSALEVDETYVPKEFNAETFTFH ADICTLSEKERQIKKQTALVELV KHKPKATKEQLKAVMDDFAAFVEKCCKADDKETCFAEEGKKLVAAS QAA (SEQ ID NO: 86) Alpha 1-AntitrypsinChain A:  HPTFNKITPNLAEFAFSLYRQLAGLN_A: 212(193)_LYS, GLU_A: 86(67)_LYS, HQSNSTNIFFSPVSIAAAFAMLSGLU_A: 175(156)_LYS, ASN_A: 278(259)_LYS, LGAKGDTHDEILEGLNFNLTEIPASP_A: 280(261)_LYS, ASN_A: 46(27)_LYS, EAQIHEGFQELLRTLNQPDSQLQGLU_A: 257(238)_LYS, GLU_A: 279(260)_LYS, LTTGNGLFLSEGLKLVDKFLEDVGLN_A: 44(25)_LYS, ASP_A: 270(251)_LYS, KKLYHSEAFTVNFGDTEEAKKQIGLU_A: 277(258)_LYS, GLN_A: 305(286)_LYS, NDYVEKGTQGKIVDLVKELDRDTASN_A: 314(295)_LYS, GLU_A: 346(327)_LYS, VFALVNYIFFKGKWERPFEVKDTGLN_A: 91(72)_LYS EEEDFHVDQVTTVKVPMMKRLGM FNIQHCKKLSSWVLLMKYLGNATAIFFLPDEGKLQHLENELTHDII TKFLENEDRRSASLHLPKLSITG TYDLKSVLGQLGITKVFSNGADLSGVTEEAPLKLSKAVHKAVLTID EKGTEAAGAMFLEAIPMSIPPEV KFNKPFVFLMIEQNTKSPLFMGKVVNPTQK (SEQ ID NO: 87) Hemoglobin(1bz0) Chain A: VLSPADKTNVKAAWGKVGAHAGE GLU_B: 43(43)_LYS, ASN_B: 19(19)_LYS,YGAEALERMFLSFPTTKTYFPHF ASP_A: 75(75)_LYS, GLU_B: 6(6)_LYS,DLSHGSAQVKGHGKKVADALTNA ASP_B: 73(73)_LYS, ASP_A: 47(47)_LYS,VAHVDDMPNALSALSDLHAHKLR GLU_B: 101(101)_LYS, ASN_A: 68(68)_LYS,VDPVNFKLLSHCLLVTLAAHLPA ASP_A: 74(74)_LYS, ASN_A: 78(78)_LYS,EFTPAVHASLDKFLASVSTVLTS ASP_A: 94(94)_LYS, ASP_B: 79(79)_LYS, KYRASP_B: 94(94)_LYS, ASP_B: 99(99)_LYS, (SEQ ID NO: 88)GLU_B: 121(121)_LYS Chain B:  VHLTPEEKSAVTALWGKVNVDEVGGEALGRLLVVYPWTQRFFESFG DLSTPDAVMGNPKVKAHGKKVLG AFSDGLAHLDNLKGTFATLSELHCDKLHVDPENFRLLGNVLVCVLA HHFGKEFTPPVQAAYQKVVAGVA NALAHKYH (SEQ ID NO: 89)

TABLE 2 Exemplary Transcription Factors that can be Supercharged

Classified according to their regulatory function:
I. constitutively-active - present in all cells at all times - general transcription factors, Sp1, NF1, CCAAT
II. conditionally-active - requires activation
  II.A developmental (cell specific) - expression is tightly controlled, but, once expressed, require no additional activation - GATA, HNF, PIT-1, MyoD, Myf5, Hox, Winged Helix
  II.B signal-dependent - requires external signal for activation
    II.B.1 extracellular ligand-dependent - nuclear receptors
    II.B.2 intracellular ligand-dependent - activated by small intracellular molecules - SREBP, p53, orphan nuclear receptors
    II.B.3 cell membrane receptor-dependent - second messenger signaling cascades resulting in the phosphorylation of the transcription factor
      II.B.3.a resident nuclear factors - reside in the nucleus regardless of activation state - CREB, AP-1, Mef2
      II.B.3.b latent cytoplasmic factors - inactive form reside in the cytoplasm, but, when activated, are translocated into the nucleus - STAT, R-SMAD, NF-kB, Notch, TUBBY, NFAT

Classified based on sequence similarity and hence the tertiary structure of their DNA binding domains:
1 Superclass: Basic Domains (Basic-helix-loop-helix)
  1.1 Class: Leucine zipper factors (bZIP)
    1.1.1 Family: AP-1(-like) components; includes (c-Fos/c-Jun)
    1.1.2 Family: CREB
    1.1.3 Family: C/EBP-like factors
    1.1.4 Family: bZIP/PAR
    1.1.5 Family: Plant G-box binding factors
    1.1.6 Family: ZIP only
  1.2 Class: Helix-loop-helix factors (bHLH)
    1.2.1 Family: Ubiquitous (class A) factors
    1.2.2 Family: Myogenic transcription factors (MyoD)
    1.2.3 Family: Achaete-Scute
    1.2.4 Family: Tal/Twist/Atonal/Hen
  1.3 Class: Helix-loop-helix/leucine zipper factors (bHLH-ZIP)
    1.3.1 Family: Ubiquitous bHLH-ZIP factors; includes USF (USF1, USF2); SREBP (SREBP)
    1.3.2 Family: Cell-cycle controlling factors; includes c-Myc
  1.4 Class: NF-1
    1.4.1 Family: NF-1 (A, B, C, X)
  1.5 Class: RF-X
    1.5.1 Family: RF-X (1, 2, 3, 4, 5, ANK)
  1.6 Class: bHSH
2 Superclass: Zinc-coordinating DNA-binding domains
  2.1 Class: Cys4 zinc finger of nuclear receptor type
    2.1.1 Family: Steroid hormone receptors
    2.1.2 Family: Thyroid hormone receptor-like factors
  2.2 Class: diverse Cys4 zinc fingers
    2.2.1 Family: GATA-Factors
  2.3 Class: Cys2His2 zinc finger domain
    2.3.1 Family: Ubiquitous factors, includes TFIIIA, Sp1
    2.3.2 Family: Developmental/cell cycle regulators; includes Kruppel
    2.3.4 Family: Large factors with NF-6B-like binding properties
  2.4 Class: Cys6 cysteine-zinc cluster
  2.5 Class: Zinc fingers of alternating composition
3 Superclass: Helix-turn-helix
  3.1 Class: Homeo domain
    3.1.1 Family: Homeo domain only; includes Ubx
    3.1.2 Family: POU domain factors; includes Oct
    3.1.3 Family: Homeo domain with LIM region
    3.1.4 Family: homeo domain plus zinc finger motifs
  3.2 Class: Paired box
    3.2.1 Family: Paired plus homeo domain
    3.2.2 Family: Paired domain only
  3.3 Class: Fork head/winged helix
    3.3.1 Family: Developmental regulators; includes forkhead
    3.3.2 Family: Tissue-specific regulators
    3.3.3 Family: Cell-cycle controlling factors
    3.3.0 Family: Other regulators
  3.4 Class: Heat Shock Factors
    3.4.1 Family: HSF
  3.5 Class: Tryptophan clusters
    3.5.1 Family: Myb
    3.5.2 Family: Ets-type
    3.5.3 Family: Interferon regulatory factors
  3.6 Class: TEA (transcriptional enhancer factor) domain
    3.6.1 Family: TEA (TEAD1, TEAD2, TEAD3, TEAD4)
4 Superclass: beta-Scaffold Factors with Minor Groove Contacts
  4.1 Class: RHR (Rel homology region)
    4.1.1 Family: Rel/ankyrin; NF-kappaB
    4.1.2 Family: ankyrin only
    4.1.3 Family: NFAT (Nuclear Factor of Activated T-cells) (NFATC1, NFATC2, NFATC3)
  4.2 Class: STAT
    4.2.1 Family: STAT
  4.3 Class: p53
    4.3.1 Family: p53
  4.4 Class: MADS box
    4.4.1 Family: Regulators of differentiation; includes (Mef2)
    4.4.2 Family: Responders to external signals, SRF (serum response factor) (SRF)
  4.5 Class: beta-Barrel alpha-helix transcription factors
  4.6 Class: TATA binding proteins
    4.6.1 Family: TBP
    4.7.1 Family: SOX genes, SRY
    4.7.2 Family: TCF-1 (TCF1)
    4.7.3 Family: HMG2-related, SSRP1
    4.7.5 Family: MATA
  4.8 Class: Heteromeric CCAAT factors
    4.8.1 Family: Heteromeric CCAAT factors
  4.9 Class: Grainyhead
    4.9.1 Family: Grainyhead
  4.10 Class: Cold-shock domain factors
    4.10.1 Family: csd
  4.11 Class: Runt
    4.11.1 Family: Runt
0 Superclass: Other Transcription Factors
  0.1 Class: Copper fist proteins
  0.2 Class: HMGI(Y) (HMGA1)
    0.2.1 Family: HMGI(Y)
  0.3 Class: Pocket domain
  0.4 Class: E1A-like factors
  0.5 Class: AP2/EREBP-related factors
    0.5.1 Family: AP2
    0.5.2 Family: EREBP
    0.5.3 Superfamily: AP2/B3
      0.5.3.1 Family: ARF
      0.5.3.2 Family: ABI
      0.5.3.3 Family: RAV

In certain embodiments, a subset of the mutations proposed in Table 1 for a particular protein are made to create the supercharged protein. In certain embodiments, at least two mutations are made. In certain embodiments, at least three mutations are made. In certain embodiments, at least four mutations are made. In certain embodiments, at least five mutations are made. In certain embodiments, at least ten mutations are made. In certain embodiments, at least fifteen mutations are made. In certain embodiments, at least twenty mutations are made. In certain embodiments, all the proposed mutations are made to create the superpositively charged protein. In certain embodiments, none of the proposed mutations are made but rather one or more charged moieties are added to the protein to create the superpositively charged protein.
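Applying a subset of the Table 1 mutations to a chain can be sketched in Python as follows. This is a minimal illustration, not the method used to generate the table. It assumes that each Table 1 entry has the form `ASP_A: 210(115)_LYS` (original residue, chain, structure numbering, 1-based position in the listed chain sequence in parentheses, replacement residue). The function names `parse_mutation` and `apply_mutations` are hypothetical, and the example sequence and mutations are the insulin chain B entry as read from Table 1.

```python
import re

# Three-letter to one-letter codes for the residue types that appear in Table 1.
THREE_TO_ONE = {"ASP": "D", "GLU": "E", "ASN": "N", "GLN": "Q", "LYS": "K", "ARG": "R"}

# Assumed entry format: ORIG_CHAIN: structureNumber(sequenceIndex)_NEW
MUTATION_RE = re.compile(r"([A-Z]{3})_(\w+):\s*(-?\d+)\((\d+)\)_([A-Z]{3})")


def parse_mutation(token):
    """Parse one Table 1 style entry into its parts (assumed format)."""
    cleaned = token.strip().replace("\u2212", "-")  # normalize a typographic minus sign
    match = MUTATION_RE.fullmatch(cleaned)
    if match is None:
        raise ValueError(f"unrecognized mutation entry: {token!r}")
    old, chain, structure_number, index, new = match.groups()
    return {
        "old": THREE_TO_ONE[old],
        "chain": chain,
        "structure_number": int(structure_number),
        "index": int(index),  # assumed 1-based position in the listed sequence
        "new": THREE_TO_ONE[new],
    }


def apply_mutations(sequence, chain, tokens):
    """Return the chain sequence with the given subset of mutations applied."""
    residues = list(sequence)
    for token in tokens:
        mutation = parse_mutation(token)
        if mutation["chain"] != chain:
            continue  # entry belongs to a different chain
        position = mutation["index"] - 1
        if residues[position] != mutation["old"]:
            raise ValueError(f"{token!r} does not match residue {residues[position]!r}")
        residues[position] = mutation["new"]
    return "".join(residues)


# Insulin chain B and a subset of its proposed mutations, as read from Table 1.
insulin_b = "FVNQHLCGSHLVEALYLVCGERGFFYTPK"
subset = ["ASN_B: 3(3)_LYS", "GLU_B: 13(13)_LYS"]
print(apply_mutations(insulin_b, "B", subset))
```

The check against the original residue guards against applying an entry to the wrong chain or with the wrong numbering, which is easy to do when a protein has several chains with offset numbering.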

In certain embodiments, the supercharged protein is a naturally occurring supercharged protein. In certain embodiments, the theoretical net charge on the naturally occurring supercharged protein is at least +1, at least +2, at least +3, at least +4, at least +5, at least +10, at least +15, at least +20, at least +25, at least +30, at least +35, or at least +40. In certain embodiments, the supercharged protein has a charge:molecular weight ratio of at least approximately 0.8. In certain embodiments, the supercharged protein has a charge:molecular weight ratio of at least approximately 1.0. In certain embodiments, the supercharged protein has a charge:molecular weight ratio of at least approximately 1.2. In certain embodiments, the supercharged protein has a charge:molecular weight ratio of at least approximately 1.4. In certain embodiments, the supercharged protein has a charge:molecular weight ratio of at least approximately 1.5. In certain embodiments, the supercharged protein has a charge:molecular weight ratio of at least approximately 1.6. In certain embodiments, the supercharged protein has a charge:molecular weight ratio of at least approximately 1.7. In certain embodiments, the supercharged protein has a charge:molecular weight ratio of at least approximately 1.8. In certain embodiments, the supercharged protein has a charge:molecular weight ratio of at least approximately 1.9. In certain embodiments, the supercharged protein has a charge:molecular weight ratio of at least approximately 2.0. In certain embodiments, the supercharged protein has a charge:molecular weight ratio of at least approximately 2.5. In certain embodiments, the supercharged protein has a charge:molecular weight ratio of at least approximately 3.0. In certain embodiments, the molecular weight of the protein ranges from approximately 4 kDa to approximately 100 kDa. In certain embodiments, the molecular weight of the protein ranges from approximately 10 kDa to approximately 45 kDa.
In certain embodiments, the molecular weight of the protein ranges from approximately 5 kDa to approximately 50 kDa. In certain embodiments, the molecular weight of the protein ranges from approximately 10 kDa to approximately 60 kDa. In certain embodiments, the naturally occurring supercharged protein is histone related. In certain embodiments, the naturally occurring supercharged protein is ribosome related. Examples of naturally occurring supercharged proteins include, but are not limited to, cyclon (ID No.: Q9H6F5); PNRC1 (ID No.: Q12796); RNPS1 (ID No.: Q15287); SURF6 (ID No.: O75683); AR6P (ID No.: Q66PJ3); NKAP (ID No.: Q8N5F7); EBP2 (ID No.: Q99848); LSM11 (ID No.: P83369); RL4 (ID No.: P36578); KRR1 (ID No.: Q13601); RY-1 (ID No.: Q8WVK2); BriX (ID No.: Q8TDN6); MNDA (ID No.: P41218); H1b (ID No.: P16401); cyclin (ID No.: Q9UK58); MDK (ID No.: P21741); Midkine (ID No.: P21741); PROK (ID No.: Q9HC23); FGF5 (ID No.: P12034); SFRS (ID No.: Q8N9Q2); AKIP (ID No.: Q9NWT8); CDK (ID No.: Q8N726); beta-defensin (ID No.: P81534); Defensin 3 (ID No.: P81534); PAVAC (ID No.: P18509); PACAP (ID No.: P18509); eotaxin-3 (ID No.: Q9Y258); histone H2A (ID No.: Q7L7L0); HMGB1 (ID No.: P09429); C-Jun (ID No.: P05412); TERF 1 (ID No.: P54274); N-DEK (ID No.: P35659); PIAS 1 (ID No.: O75925); Ku70 (ID No.: P12956); HBEGF (ID No.: Q99075); and HGF (ID No.: P14210).
In certain embodiments, the supercharged protein utilized in the invention is U4/U6.U5 tri-snRNP-associated protein 3 (ID No.: Q8WVK2); beta-defensin (ID No.: P81534); Protein SFRS12IP1 (ID No.: Q8N9Q2); midkine (ID No.: P21741); C-C motif chemokine 26 (ID No.: Q9Y258); surfeit locus protein 6 (ID No.: O75683); Aurora kinase A-interacting protein (ID No.: Q9NWT8); NF-kappa-B-activating protein (ID No.: Q8N5F7); histone H1.5 (ID No.: P16401); histone H2A type 3 (ID No.: Q7L7L0); 60S ribosomal protein L4 (ID No.: P36578); isoform 1 of RNA-binding protein with serine-rich domain 1 (ID No.: Q15287-1); isoform 4 of cyclin-dependent kinase inhibitor 2A (ID No.: Q8N726-1); isoform 1 of prokineticin-2 (ID No.: Q9HC23-1); isoform 1 of ADP-ribosylation factor-like protein 6-interacting protein 4 (ID No.: Q66PJ3-1); isoform long of fibroblast growth factor 5 (ID No.: P12034-1); or isoform 1 of cyclin-L1 (ID No.: Q9UK58-1). Other possible naturally occurring supercharged proteins from the human proteome that may be utilized in the present invention are included in the list below. The proteins listed have a charge:molecular weight ratio of greater than 0.8.
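The net-charge and charge:molecular weight criteria described above can be illustrated with a short calculation. The following Python sketch is not taken from the specification and rests on two assumptions: that theoretical net charge can be approximated by counting lysine and arginine as +1 and aspartate and glutamate as -1 (histidine and the termini are ignored), and that the charge:molecular weight ratio is expressed as net charge per kilodalton. The second assumption is an inference from the listed values, which are close to, but not always equal to, this quotient (for example, 23 / 6.822 is approximately 3.37 against a listed 3.49). The function names are hypothetical.

```python
# Illustrative sketch only. The charge model and the per-kDa unit are
# assumptions, not definitions taken from the specification.

POSITIVE_RESIDUES = frozenset("KR")  # assumed +1 each (Lys, Arg)
NEGATIVE_RESIDUES = frozenset("DE")  # assumed -1 each (Asp, Glu)


def theoretical_net_charge(sequence: str) -> int:
    """Return (Lys + Arg) - (Asp + Glu) for a one-letter amino acid sequence."""
    seq = sequence.upper()
    positive = sum(1 for residue in seq if residue in POSITIVE_RESIDUES)
    negative = sum(1 for residue in seq if residue in NEGATIVE_RESIDUES)
    return positive - negative


def charge_per_kda(net_charge: int, molecular_weight_da: float) -> float:
    """Return net charge divided by molecular weight in kDa (assumed unit)."""
    if molecular_weight_da <= 0:
        raise ValueError("molecular weight must be positive")
    return net_charge / (molecular_weight_da / 1000.0)


def meets_ratio_threshold(net_charge: int, molecular_weight_da: float,
                          threshold: float = 0.8) -> bool:
    """Check a protein against a minimum charge:molecular weight ratio."""
    return charge_per_kda(net_charge, molecular_weight_da) >= threshold


if __name__ == "__main__":
    # Charge (27) and MW (13050 Da) as listed for protamine-2 (P04554).
    ratio = charge_per_kda(27, 13050)
    print(f"charge per kDa: {ratio:.2f}")
    print("passes 0.8 threshold:", meets_ratio_threshold(27, 13050))
```

A more careful treatment would compute charge at a stated pH from residue pKa values; the integer count here is only meant to make the thresholds concrete.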

Ratio  Charge  Name  aa  MW

Cationic Proteins
[‘3.49’, 23, ‘sp|P04553|HSP1_HUMAN Sperm protamine-P1 OS = Homo sapiens GN = PRM1’, 51, 6822]
[‘3.00’, 19, ‘sp|P09430|STP1_HUMAN Spermatid nuclear transition protein 1 OS = Homo sapiens GN = TNP1’, 55, 6424]
[‘2.19’, 23, ‘sp|Q9UNZ5|L10K_HUMAN Leydig cell tumor 10 kDa protein homolog OS = Homo sapiens GN = C19orf53’, 99, 10576]
[‘2.07’, 27, ‘sp|P04554|PRM2_HUMAN Protamine-2 OS = Homo sapiens GN = PRM2’, 102, 13050]
[‘1.80’, 18, ‘sp|Q5EE01|CUG2_HUMAN Cancer-up-regulated gene 2 protein OS = Homo sapiens GN = C6orf173’, 88, 10061]
[‘1.78’, 17, ‘sp|O00479|HMGN4_HUMAN High mobility group nucleosome-binding domain-containing protein 4 OS = Homo sapiens GN = HMGN4’, 90, 9538]
[‘1.65’, 25, ‘sp|Q9BRT6|CL031_HUMAN UPF0446 protein C12orf31 OS = Homo sapiens GN = C12orf31’, 129, 15225]
[‘1.62’, 80, ‘sp|Q8IV32|CCD71_HUMAN Coiled-coil domain-containing protein 71 OS = Homo sapiens GN = CCDC71’, 467, 49618]
[‘1.59’, 24, ‘sp|Q05952|STP2_HUMAN Nuclear transition protein 2 OS = Homo sapiens GN = TNP2’, 138, 15640]
[‘1.57’, 22, ‘sp|Q07325|CXCL9_HUMAN C-X-C motif chemokine 9 OS = Homo sapiens GN = CXCL9’, 125, 14018]
[‘1.56’, 11, ‘sp|Q9Y2S6|CCD72_HUMAN Coiled-coil domain-containing protein 72 OS = Homo sapiens GN = CCDC72’, 64, 7066]
[‘1.55’, 29, ‘sp|Q8WVK2|SNUT3_HUMAN U4/U6.U5 tri-snRNP-associated protein 3 OS = Homo sapiens’, 155, 18860]
[‘1.55’, 11, ‘sp|P81534|D103A_HUMAN Beta-defensin 103 OS = Homo sapiens GN = DEFB103A’, 67, 7697]
[‘1.54’, 8, ‘sp|Q5VTU8|AT5EL_HUMAN ATP synthase subunit epsilon-like protein, mitochondrial OS = Homo sapiens GN = ATP5EP2’, 51, 5806]
[‘1.45’, 10, ‘sp|P84101|SERF2_HUMAN Small EDRK-rich factor 2 OS = Homo sapiens GN = SERF2’, 59, 6899]
[‘1.40’, 102, ‘sp|A6NNA2|SRR2L_HUMAN SRRM2-like protein OS = Homo sapiens’, 665, 72877]
[‘1.39’, 40, ‘sp|Q8N9E0|F133A_HUMAN Protein FAM133A OS = Homo sapiens GN = FAM133A’, 248, 28940]
[‘1.38’, 35, ‘sp|A6NF02|NPPL2_HUMAN NPIP-like protein ENSP00000346774 OS = Homo sapiens’, 221, 26005]
[‘1.37’, 11, ‘sp|Q7Z4L0|COX83_HUMAN Cytochrome c oxidase polypeptide 8C, mitochondrial OS = Homo sapiens GN = COX8C’, 72, 8128]
[‘1.35’, 34, ‘sp|O75200|NPPL1_HUMAN NPIP-like protein LOC440350 OS = Homo sapiens’, 221, 25868]
[‘1.32’, 18, ‘sp|Q6UXB2|VCC1_HUMAN VEGF co-regulated chemokine 1 OS = Homo sapiens GN = CXCL17’, 119, 13819]
[‘1.32’, 10, ‘sp|Q8N688|DB123_HUMAN Beta-defensin 123 OS = Homo sapiens GN = DEFB123’, 67, 8104]
[‘1.31’, 36, ‘sp|Q5U4N7|GDF5O_HUMAN Protein GDF5OS, mitochondrial OS = Homo sapiens GN = GDF5OS’, 250, 28153]
[‘1.31’, 12, ‘sp|O00198|HRK_HUMAN Activator of apoptosis harakiri OS = Homo sapiens GN = HRK’, 91, 9883]
[‘1.30’, 29, ‘sp|Q8WW32|HMGB4_HUMAN High mobility group protein B4 OS = Homo sapiens GN = HMGB4’, 186, 22404]
[‘1.28’, 23, ‘sp|Q8N9Q2|S12IP_HUMAN Protein SFRS12IP1 OS = Homo sapiens GN = SFRS12IP1’, 155, 18176]
[‘1.26’, 19, ‘sp|P21741|MK_HUMAN Midkine OS = Homo sapiens GN = MDK’, 143, 15585]
[‘1.26’, 16, ‘sp|Q08E93|F27E3_HUMAN Protein FAM27E3 OS = Homo sapiens GN = FAM27E3’, 113, 13507]
[‘1.23’, 44, ‘sp|Q96QD9|FYTD1_HUMAN Forty-two-three domain-containing protein 1 OS = Homo sapiens GN = FYTTD1’, 318, 35799]
[‘1.23’, 16, ‘sp|P62314|SMD1_HUMAN Small nuclear ribonucleoprotein Sm D1 OS = Homo sapiens GN = SNRPD1’, 119, 13281]
[‘1.23’, 13, ‘sp|Q9Y258|CCL26_HUMAN C-C motif chemokine 26 OS = Homo sapiens GN = CCL26’, 94, 10647]
[‘1.22’, 10, ‘sp|Q96PI1|SPRR4_HUMAN Small proline-rich protein 4 OS = Homo sapiens GN = SPRR4’, 79, 8793]
[‘1.21’, 24, ‘sp|B2CW77|KILIN_HUMAN Killin OS = Homo sapiens’, 178, 19957]
[‘1.20’, 10, ‘sp|Q9Y5V0|ZN706_HUMAN Zinc finger protein 706 OS = Homo sapiens GN = ZNF706’, 76, 8497]
[‘1.20’, 6, ‘sp|P56381|ATP5E_HUMAN ATP synthase subunit epsilon, mitochondrial OS = Homo sapiens GN = ATP5E’, 51, 5779]
[‘1.19’, 61, ‘sp|Q9HAH1|ZN556_HUMAN Zinc finger protein 556 OS = Homo sapiens GN = ZNF556’, 456, 51581]
[‘1.19’, 30, ‘sp|P17026|ZNF22_HUMAN Zinc finger protein 22 OS = Homo sapiens GN = ZNF22’, 224, 25915]
[‘1.18’, 16, ‘sp|Q9NRJ3|CCL28_HUMAN C-C motif chemokine 28 OS = Homo sapiens GN = CCL28’, 127, 14279]
[‘1.16’, 11, ‘sp|O43262|LEU2_HUMAN Leukemia-associated protein 2 OS = Homo sapiens GN = DLEU2’, 84, 10196]
[‘1.15’, 38, ‘sp|Q6PK04|CC137_HUMAN Coiled-coil domain-containing protein 137 OS = Homo sapiens GN = CCDC137’, 289, 33231]
[‘1.15’, 18, ‘sp|A8MYZ5|YC026_HUMAN IQ domain-containing protein ENSP00000381760 OS = Homo sapiens’, 130, 15797]
[‘1.15’, 16, ‘sp|Q5T7N7|F27E1_HUMAN Protein FAM27E1 OS = Homo sapiens GN = FAM27E1’, 126, 14751]
[‘1.15’, 16, ‘sp|Q5SNX5|F27E2_HUMAN Protein FAM27E2 OS = Homo sapiens GN = FAM27E2’, 125, 14710]
[‘1.15’, 16, ‘sp|O00585|CCL21_HUMAN C-C motif chemokine 21 OS = Homo sapiens GN = CCL21’, 134, 14646]
[‘1.15’, 6, ‘sp|Q13794|APR_HUMAN Phorbol-12-myristate-13-acetate-induced protein 1 OS = Homo sapiens GN = PMAIP1’, 54, 6030]
[‘1.14’, 13, ‘sp|P19875|MIP2A_HUMAN Macrophage inflammatory protein 2-alpha OS = Homo sapiens GN = CXCL2’, 107, 11388]
[‘1.14’, 12, ‘sp|Q9P021|CRIPT_HUMAN Cysteine-rich PDZ-binding protein OS = Homo sapiens GN = CRIPT’, 101, 11215]
[‘1.14’, 11, ‘sp|O14625|CXL11_HUMAN C-X-C motif chemokine 11 OS = Homo sapiens GN = CXCL11’, 94, 10364]
[‘1.13’, 10, ‘sp|P61580|NP10_HUMAN HERV-K_5q33.3 provirus Np9 protein OS = Homo sapiens’, 75, 8892]
[‘1.12’, 46, ‘sp|O75683|SURF6_HUMAN Surfeit locus protein 6 OS = Homo sapiens GN = SURF6’, 361, 41450]
[‘1.12’, 15, ‘sp|P0C7P0|CISD3_HUMAN CDGSH iron sulfur domain-containing protein 3, mitochondrial OS = Homo sapiens GN = CISD3’, 127, 14215]
[‘1.10’, 37, ‘sp|Q9Y2B4|T53G5_HUMAN TP53-target gene 5 protein OS = Homo sapiens GN = TP53TG5’, 290, 34019]
[‘1.10’, 33, ‘sp|Q9Y3A2|UTP11_HUMAN Probable U3 small nucleolar RNA-associated protein 11 OS = Homo sapiens GN = UTP11L’, 253, 30446]
[‘1.10’, 21, ‘sp|Q9HCT0|FGF22_HUMAN Fibroblast growth factor 22 OS = Homo sapiens GN = FGF22’, 170, 19662]
[‘1.10’, 11, ‘sp|P51671|CCL11_HUMAN Eotaxin OS = Homo sapiens GN = CCL11’, 97, 10731]
[‘1.09’, 14, ‘sp|Q9Y421|FA32A_HUMAN Protein FAM32A OS = Homo sapiens GN = FAM32A’, 112, 13178]
[‘1.09’, 12, ‘sp|Q2M2W7|CQ058_HUMAN UPF0450 protein C17orf58 OS = Homo sapiens GN = C17orf58’, 97, 11205]
[‘1.09’, 11, ‘sp|Q99616|CCL13_HUMAN C-C motif chemokine 13 OS = Homo sapiens GN = CCL13’, 98, 10986]
[‘1.09’, 11, ‘sp|P0C665|PRAC2_HUMAN Small nuclear protein PRAC2 OS = Homo sapiens GN = PRAC2’, 90, 10483]
[‘1.09’, 11, ‘sp|P0C0P6|NPS_HUMAN Neuropeptide S OS = Homo sapiens GN = NPS’, 89, 10103]
[‘1.08’, 21, ‘sp|Q8IXL9|IQCF2_HUMAN IQ domain-containing protein F2 OS = Homo sapiens GN = IQCF2’, 164, 19627]
[‘1.08’, 8, ‘sp|Q13891|BT3L2_HUMAN Transcription factor BTF3 homolog 2 OS = Homo sapiens GN = BTF3L2’, 67, 7605]
[‘1.08’, 7, ‘sp|P56378|68MP_HUMAN 6.8 kDa mitochondrial proteolipid OS = Homo sapiens GN = MP68’, 58, 6662]
[‘1.08’, 6, ‘sp|P15516|HIS3_HUMAN Histatin-3 OS = Homo sapiens GN = HTN3’, 51, 6149]
[‘1.07’, 26, ‘sp|Q5T7N8|F27D1_HUMAN Protein FAM27D1 OS = Homo sapiens GN = FAM27D1’, 215, 24905]
[‘1.07’, 24, ‘sp|Q9NWT8|AKIP_HUMAN Aurora kinase A-interacting protein OS = Homo sapiens GN = AURKAIP1’, 199, 22354]
[‘1.07’, 16, ‘sp|A8MQ11|PM2L5_HUMAN Postmeiotic segregation increased 2-like protein 5 OS = Homo sapiens GN = PMS2L5’, 134, 15169]
[‘1.07’, 15, ‘sp|Q6UXT8|F150A_HUMAN Protein FAM150A OS = Homo sapiens GN = FAM150A’, 129, 14268]
[‘1.06’, 61, ‘sp|Q14593|ZN273_HUMAN Zinc finger protein 273 OS = Homo sapiens GN = ZNF273’, 504, 58045]
[‘1.06’, 9, ‘sp|Q9ULZ1|APEL_HUMAN Apelin OS = Homo sapiens GN = APLN’, 77, 8569]
[‘1.05’, 10, ‘sp|Q9UGL9|CRCT1_HUMAN Cysteine-rich C-terminal protein 1 OS = Homo sapiens GN = CRCT1’, 99, 9735]
[‘1.05’, 10, ‘sp|P81277|PRRP_HUMAN Prolactin-releasing peptide OS = Homo sapiens GN = PRLH’, 87, 9639]
[‘1.04’, 31, ‘sp|P52744|ZN138_HUMAN Zinc finger protein 138 OS = Homo sapiens GN = ZNF138’, 262, 30591]
[‘1.04’, 11, ‘sp|Q6IPR1|LYRM5_HUMAN LYR motif-containing protein 5 OS = Homo sapiens GN = LYRM5’, 88, 10604]
[‘1.04’, 9, ‘sp|P09669|COX6C_HUMAN Cytochrome c oxidase polypeptide VIc OS = Homo sapiens GN = COX6C’, 75, 8781]
[‘1.04’, 7, ‘sp|Q9NRQ5|CK075_HUMAN UPF0443 protein C11orf75 OS = Homo sapiens GN = C11orf75’, 59, 6738]
[‘1.03’, 23, ‘sp|Q8NHZ7|MB3L2_HUMAN Methyl-CpG-binding domain protein 3-like 2 OS = Homo sapiens GN = MBD3L2’, 204, 22695]
[‘1.03’, 11, ‘sp|Q9HD34|LYRM4_HUMAN LYR motif-containing protein 4 OS = Homo sapiens GN = LYRM4’, 91, 10758]
[‘1.03’, 10, ‘sp|Q06250|WIT1_HUMAN Wilms tumor-associated protein OS = Homo sapiens GN = WIT1’, 92, 10038]
[‘1.02’, 40, ‘sp|Q9NP08|HMX1_HUMAN Homeobox protein HMX1 OS = Homo sapiens GN = HMX1’, 373, 39225]
[‘1.02’, 15, ‘sp|Q9H963|ZN702_HUMAN Zinc finger protein 702 OS = Homo sapiens GN = ZNF702’, 129, 15053]
[‘1.02’, 14, ‘sp|P37108|SRP14_HUMAN Signal recognition particle 14 kDa protein OS = Homo sapiens GN = SRP14’, 136, 14569]
[‘1.02’, 12, ‘sp|P52926|HMGA2_HUMAN High mobility group protein HMGI-C OS = Homo sapiens GN = HMGA2’, 109, 11832]
[‘1.02’, 7, ‘sp|P58511|F165B_HUMAN UPF0601 protein FAM165B OS = Homo sapiens GN = FAM165B’, 58, 6886]
[‘1.01’, 24, ‘sp|P52743|ZN137_HUMAN Zinc finger protein 137 OS = Homo sapiens GN = ZNF137’, 207, 24114]
[‘1.01’, 18, ‘sp|Q8N912|CN180_HUMAN Transmembrane protein C14orf180 OS = Homo sapiens GN = C14orf180’, 160, 18051]
[‘1.01’, 14, ‘sp|Q8N8V8|TM105_HUMAN Transmembrane protein 105 OS = Homo sapiens GN = TMEM105’, 129, 13990]
[‘1.01’, 14, ‘sp|Q5TZK3|F74A4_HUMAN Protein FAM74A4 OS = Homo sapiens GN = FAM74A4’, 123, 14772]
[‘1.01’, 14, ‘sp|P42127|ASIP_HUMAN Agouti-signaling protein OS = Homo sapiens GN = ASIP’, 132, 14515]
[‘1.01’, 10, ‘sp|P60468|SC61B_HUMAN Protein transport protein Sec61 subunit beta OS = Homo sapiens GN = SEC61B’, 96, 9974]
[‘1.01’, 9, ‘sp|P61581|NP11_HUMAN HERV-K_22q11.21 provirus Np9 protein OS = Homo sapiens’, 75, 8893]
[‘1.00’, 72, ‘sp|Q6ZQV5|ZN788_HUMAN Zinc finger protein 788 OS = Homo sapiens GN = ZNF788’, 615, 71992]
[‘1.00’, 70, ‘sp|Q5HYK9|ZN667_HUMAN Zinc finger protein 667 OS = Homo sapiens GN = ZNF667’, 610, 70157]
[‘1.00’, 26, ‘sp|Q9H0W7|THAP2_HUMAN THAP domain-containing protein 2 OS = Homo sapiens GN = THAP2’, 228, 26259]
[‘0.99’, 20, ‘sp|P35318|ADML_HUMAN ADM OS = Homo sapiens GN = ADM’, 185, 20420]
[‘0.99’, 18, ‘sp|P21246|PTN_HUMAN Pleiotrophin OS = Homo sapiens GN = PTN’, 168, 18942]
[‘0.99’, 13, ‘sp|P23582|ANFC_HUMAN C-type natriuretic peptide OS = Homo sapiens GN = NPPC’, 126, 13246]
[‘0.99’, 10, ‘sp|P02778|CXL10_HUMAN C-X-C motif chemokine 10 OS = Homo sapiens GN = CXCL10’, 98, 10881]
[‘0.98’, 15, ‘sp|P14555|PA2GA_HUMAN Phospholipase A2, membrane associated OS = Homo sapiens GN = PLA2G2A’, 144, 16082]
[‘0.98’, 12, ‘sp|Q8NDT4|ZN663_HUMAN Zinc finger protein 663 OS = Homo sapiens GN = ZNF663’, 106, 12434]
[‘0.98’, 12, ‘sp|O00175|CCL24_HUMAN C-C motif chemokine 24 OS = Homo sapiens GN = CCL24’, 119, 13133]
[‘0.97’, 17, ‘sp|Q5T6X4|F162B_HUMAN UPF0389 protein FAM162B OS = Homo sapiens GN = FAM162B’, 162, 17684]
[‘0.97’, 15, ‘sp|Q7Z4H4|ADM2_HUMAN ADM2 OS = Homo sapiens GN = ADM2’, 148, 15865]
[‘0.97’, 11, ‘sp|P09341|GROA_HUMAN Growth-regulated alpha protein OS = Homo sapiens GN = CXCL1’, 107, 11301]
[‘0.97’, 6, ‘sp|O15263|BD02_HUMAN Beta-defensin 2 OS = Homo sapiens GN = DEFB4’, 64, 7037]
[‘0.96’, 40, ‘sp|Q96N58|ZN578_HUMAN Zinc finger protein 578 OS = Homo sapiens GN = ZNF578’, 365, 42596]
[‘0.96’, 19, ‘sp|Q9NPH9|IL26_HUMAN Interleukin-26 OS = Homo sapiens GN = IL26’, 171, 19842]
[‘0.96’, 19, ‘sp|Q8NHX4|SPTA3_HUMAN Spermatogenesis-associated protein 3 OS = Homo sapiens GN = SPATA3’, 183, 19948]
[‘0.96’, 16, ‘sp|P59020|DSCR9_HUMAN Down syndrome critical region protein 9 OS = Homo sapiens GN = DSCR9’, 149, 16743]
[‘0.96’, 8, ‘sp|Q3LI70|KR196_HUMAN Keratin-associated protein 19-6 OS = Homo sapiens GN = KRTAP19-6’, 84, 9125]
[‘0.96’, 7, ‘sp|Q9Y6X1|SERP1_HUMAN Stress-associated endoplasmic reticulum protein 1 OS = Homo sapiens GN = SERP1’, 66, 7373]
[‘0.96’, 4, ‘sp|Q9P0U5|INGX_HUMAN Inhibitor of growth protein, X-linked OS = Homo sapiens GN = INGX’, 42, 5076]
[‘0.95’, 7, ‘sp|Q8N6R1|SERP2_HUMAN Stress-associated endoplasmic reticulum protein 2 OS = Homo sapiens GN = SERP2’, 65, 7430]
[‘0.94’, 33, ‘sp|Q9H7B2|BXDC1_HUMAN Brix domain-containing protein 1 OS = Homo sapiens GN = BXDC1’, 306, 35582]
[‘0.94’, 17, ‘sp|Q96MF4|CC140_HUMAN Coiled-coil domain-containing protein 140 OS = Homo sapiens GN = CCDC140’, 163, 18252]
[‘0.94’, 16, ‘sp|Q8WW36|ZCH13_HUMAN Zinc finger CCHC domain-containing protein 13 OS = Homo sapiens GN = ZCCHC13’, 166, 18005]
[‘0.94’, 12, ‘sp|O60519|CRBL2_HUMAN cAMP-responsive element-binding protein-like 2 OS = Homo sapiens GN = CREBL2’, 120, 13783]
[‘0.93’, 16, ‘sp|Q9H1E1|RNAS7_HUMAN Ribonuclease 7 OS = Homo sapiens GN = RNASE7’, 156, 17471]
[‘0.93’, 16, ‘sp|Q14236|EPAG_HUMAN Early lymphoid activation gene protein OS = Homo sapiens GN = EPAG’, 149, 17843]
[‘0.93’, 16, ‘sp|P0C7M6|IQCF3_HUMAN IQ domain-containing protein F3 OS = Homo sapiens GN = IQCF3’, 154, 18250]
[‘0.93’, 11, ‘sp|O43927|CXL13_HUMAN C-X-C motif chemokine 13 OS = Homo sapiens GN = CXCL13’, 109, 12664]
[‘0.93’, 9, ‘sp|Q9Y6G1|TM14A_HUMAN Transmembrane protein 14A OS = Homo sapiens GN = TMEM14A’, 99, 10712]
[‘0.93’, 9, ‘sp|Q7Z7B7|DB132_HUMAN Beta-defensin 132 OS = Homo sapiens GN = DEFB132’, 95, 10610]
[‘0.93’, 8, ‘sp|Q5T5B0|LCE3E_HUMAN Late cornified envelope protein 3E OS = Homo sapiens GN = LCE3E’, 92, 9506]
[‘0.93’, 7, ‘sp|Q9NPE3|NOLA3_HUMAN H/ACA ribonucleoprotein complex subunit 3 OS = Homo sapiens GN = NOLA3’, 64, 7705]
[‘0.92’, 23, ‘sp|O95707|RPP29_HUMAN Ribonuclease P protein subunit p29 OS = Homo sapiens GN = POP4’, 220, 25424]
[‘0.92’, 14, ‘sp|Q9NPJ4|PNRC2_HUMAN Proline-rich nuclear receptor coactivator 2 OS = Homo sapiens GN = PNRC2’, 139, 15590]
[‘0.92’, 11, ‘sp|O14599|VCY2_HUMAN Testis-specific basic protein Y 2 OS = Homo sapiens GN = BPY2’, 106, 12035]
[‘0.92’, 8, ‘sp|Q8WVI0|U640_HUMAN UPF0640 protein OS = Homo sapiens’, 70, 8696]
[‘0.92’, 5, ‘sp|Q96IX5|USMG5_HUMAN Up-regulated during skeletal muscle growth protein 5 OS = Homo sapiens GN = USMG5’, 58, 6457]
[‘0.91’, 8, ‘sp|P61582|NP12_HUMAN HERV-K_1q22 provirus Np9 protein OS = Homo sapiens’, 75, 8820]
[‘0.90’, 81, ‘sp|Q08AN1|ZN616_HUMAN Zinc finger protein 616 OS = Homo sapiens GN = ZNF616’, 781, 90263]
[‘0.90’, 42, ‘sp|Q8N5F7|NKAP_HUMAN NF-kappa-B-activating protein OS = Homo sapiens GN = NKAP’, 415, 47138]
[‘0.90’, 41, ‘sp|A6NM28|ZFP92_HUMAN Zinc finger protein 92 homolog OS = Homo sapiens GN = ZFP92’, 416, 45791]
[‘0.90’, 35, ‘sp|Q14093|CYLC2_HUMAN Cylicin-2 OS = Homo sapiens GN = CYLC2’, 348, 39078]
[‘0.90’, 18, ‘sp|Q6ZT77|ZN826_HUMAN Zinc finger protein 826 OS = Homo sapiens GN = ZNF826’, 177, 20579]
[‘0.90’, 10, ‘sp|Q5T751|LCE1C_HUMAN Late cornified envelope protein 1C OS = Homo sapiens GN = LCE1C’, 118, 11543]
[‘0.90’, 8, ‘sp|P61583|NP8_HUMAN HERV-K_3q12.3 provirus Np9 protein OS = Homo sapiens GN = ERVK5’, 75, 8907]
[‘0.90’, 7, ‘sp|Q30KQ2|DB130_HUMAN Beta-defensin 130 OS = Homo sapiens GN = DEFB130’, 79, 8735]
[‘0.89’, 35, ‘sp|O75698|HUG1_HUMAN Protein HUG-1 OS = Homo sapiens GN = HUG1’, 362, 39386]
[‘0.89’, 22, ‘sp|Q8N7Y1|PRR10_HUMAN Proline-rich protein 10 OS = Homo sapiens GN = PRR10’, 241, 25772]
[‘0.89’, 22, ‘sp|Q5TFG8|F164B_HUMAN UPF0418 protein FAM164B OS = Homo sapiens GN = FAM164B’, 222, 24665]
[‘0.89’, 18, ‘sp|Q7RTS1|BHLH8_HUMAN Class B basic helix-loop-helix protein 8 OS = Homo sapiens GN = BHLHB8’, 189, 20818]
[‘0.89’, 10, ‘sp|Q5T7P3|LCE1B_HUMAN Late cornified envelope protein 1B OS = Homo sapiens GN = LCE1B’, 118, 11626]
[‘0.89’, 10, ‘sp|Q5T754|LCE1F_HUMAN Late cornified envelope protein 1F OS = Homo sapiens GN = LCE1F’, 118, 11654]
[‘0.89’, 10, ‘sp|P19876|MIP2B_HUMAN Macrophage inflammatory protein 2-beta OS = Homo sapiens GN = CXCL3’, 107, 11342]
[‘0.89’, 9, ‘sp|P80098|CCL7_HUMAN C-C motif chemokine 7 OS = Homo sapiens GN = CCL7’, 99, 11200]
[‘0.89’, 7, ‘sp|Q969E1|LEAP2_HUMAN Liver-expressed antimicrobial peptide 2 OS = Homo sapiens GN = LEAP2’, 77, 8813]
[‘0.89’, 7, ‘sp|Q30KP9|DB135_HUMAN Beta-defensin 135 OS = Homo sapiens GN = DEFB135’, 77, 8753]
[‘0.88’, 50, ‘sp|Q96CS4|ZN689_HUMAN Zinc finger protein 689 OS = Homo sapiens GN = ZNF689’, 500, 56906]
[‘0.88’, 24, ‘sp|Q5EBM4|ZN542_HUMAN Zinc finger protein 542 OS = Homo sapiens GN = ZNF542’, 241, 27663]
[‘0.88’, 11, ‘sp|Q96BP2|CHCH1_HUMAN Coiled-coil-helix-coiled-coil-helix domain-containing protein 1 OS = Homo sapiens GN = CHCHD1’, 118, 13474]
[‘0.88’, 9, ‘sp|Q6UX46|F150B_HUMAN Protein FAM150B OS = Homo sapiens GN = FAM150B’, 91, 10541]
[‘0.87’, 65, ‘sp|Q6ZR52|ZN493_HUMAN Zinc finger protein 493 OS = Homo sapiens GN = ZNF493’, 646, 75341]
[‘0.87’, 30, ‘sp|Q99848|EBP2_HUMAN Probable rRNA-processing protein EBP2 OS = Homo sapiens GN = EBNA1BP2’, 306, 34851]
[‘0.87’, 12, ‘sp|P62318|SMD3_HUMAN Small nuclear ribonucleoprotein Sm D3 OS = Homo sapiens GN = SNRPD3’, 126, 13916]
[‘0.87’, 10, ‘sp|A0PJW8|DAPL1_HUMAN Death-associated protein-like 1 OS = Homo sapiens GN = DAPL1’, 107, 11879]
[‘0.87’, 9, ‘sp|Q5T7P2|LCE1A_HUMAN Late cornified envelope protein 1A OS = Homo sapiens GN = LCE1A’, 110, 10982]
[‘0.87’, 5, ‘sp|Q96KF2|PRAC_HUMAN Small nuclear protein PRAC OS = Homo sapiens GN = PRAC’, 57, 5958]
[‘0.86’, 59, ‘sp|Q03923|ZNF85_HUMAN Zinc finger protein 85 OS = Homo sapiens GN = ZNF85’, 595, 68718]
[‘0.86’, 54, ‘sp|Q6N045|ZNP12_HUMAN Zinc finger protein ZnFP12 OS = Homo sapiens’, 540, 62759]
[‘0.86’, 43, ‘sp|Q8IZC7|ZN101_HUMAN Zinc finger protein 101 OS = Homo sapiens GN = ZNF101’, 436, 50339]
[‘0.86’, 41, ‘sp|P42696|RBM34_HUMAN RNA-binding protein 34 OS = Homo sapiens GN = RBM34’, 430, 48564]
[‘0.86’, 20, ‘sp|Q9Y324|FCF1_HUMAN rRNA-processing protein FCF1 homolog OS = Homo sapiens GN = FCF1’, 198, 23369]
[‘0.86’, 15, ‘sp|Q969E3|UCN3_HUMAN Urocortin-3 OS = Homo sapiens GN = UCN3’, 161, 17861]
[‘0.86’, 13, ‘sp|P09132|SRP19_HUMAN Signal recognition particle 19 kDa protein OS = Homo sapiens GN = SRP19’, 144, 16155]
[‘0.85’, 54, ‘sp|Q9BWE0|REPI1_HUMAN Replication initiator 1 OS = Homo sapiens GN = REPIN1’, 567, 63574]
[‘0.85’, 42, ‘sp|Q8NCK3|ZN485_HUMAN Zinc finger protein 485 OS = Homo sapiens GN = ZNF485’, 441, 50280]
[‘0.85’, 22, ‘sp|P11487|FGF3_HUMAN INT-2 proto-oncogene protein OS = Homo sapiens GN = FGF3’, 239, 26886]
[‘0.85’, 19, ‘sp|Q99748|NRTN_HUMAN Neurturin OS = Homo sapiens GN = NRTN’, 197, 22405]
[‘0.85’, 6, ‘sp|P15954|COX7C_HUMAN Cytochrome c oxidase subunit 7C, mitochondrial OS = Homo sapiens GN = COX7C’, 63, 7245]
[‘0.84’, 42, ‘sp|Q8N8L2|ZN491_HUMAN Zinc finger protein 491 OS = Homo sapiens GN = ZNF491’, 437, 50949]
[‘0.84’, 22, ‘sp|Q86XF7|ZN575_HUMAN Zinc finger protein 575 OS = Homo sapiens GN = ZNF575’, 245, 26763]
[‘0.84’, 9, ‘sp|Q5T752|LCE1D_HUMAN Late cornified envelope protein 1D OS = Homo sapiens GN = LCE1D’, 114, 11229]
[‘0.84’, 6, ‘sp|Q9NRX6|T167B_HUMAN Transmembrane protein 167B OS = Homo sapiens GN = TMEM167B’, 74, 8294]
[‘0.84’, 5, ‘sp|P80294|MT1H_HUMAN Metallothionein-1H OS = Homo sapiens GN = MT1H’, 61, 6039]
[‘0.83’, 50, ‘sp|Q9P255|ZN492_HUMAN Zinc finger protein 492 OS = Homo sapiens GN = ZNF492’, 531, 61158]
[‘0.83’, 50, ‘sp|A6NK75|ZNF98_HUMAN Zinc finger protein 98 OS = Homo sapiens GN = ZNF98’, 531, 61144]
[‘0.83’, 32, ‘sp|O15480|MAGB3_HUMAN Melanoma-associated antigen B3 OS = Homo sapiens GN = MAGEB3’, 346, 39179]
[‘0.83’, 29, ‘sp|Q96GY0|F164A_HUMAN UPF0418 protein FAM164A OS = Homo sapiens GN = FAM164A’, 325, 35062]
[‘0.83’, 26, ‘sp|Q96PP4|TSG13_HUMAN Testis-specific gene 13 protein OS = Homo sapiens GN = TSGA13’, 275, 31777]
[‘0.83’, 17, ‘sp|O15499|GSC2_HUMAN Homeobox protein goosecoid-2 OS = Homo sapiens GN = GSC2’, 205, 21544]
[‘0.83’, 10, ‘sp|P56847|TNG2_HUMAN Protein TNG2 OS = Homo sapiens GN = TNG2’, 110, 12856]
[‘0.83’, 7, ‘sp|Q9BYE3|LCE3D_HUMAN Late cornified envelope protein 3D OS = Homo sapiens GN = LCE3D’, 92, 9443]
[‘0.83’, 5, ‘sp|P07438|MT1B_HUMAN Metallothionein-1B OS = Homo sapiens GN = MT1B’, 61, 6115]
[‘0.82’, 31, ‘sp|Q6AZW8|ZN660_HUMAN Zinc finger protein 660 OS = Homo sapiens GN = ZNF660’, 331, 38270]
[‘0.82’, 11, ‘sp|O43612|OREX_HUMAN Orexin OS = Homo sapiens GN = HCRT’, 131, 13362]
[‘0.82’, 10, ‘sp|Q96DA6|TIM14_HUMAN Mitochondrial import inner membrane translocase subunit TIM14 OS = Homo sapiens GN = DNAJC19’, 116, 12498]
[‘0.82’, 9, ‘sp|Q96A98|TIP39_HUMAN Tuberoinfundibular peptide of 39 residues OS = Homo sapiens GN = PTH2’, 100, 11202]
[‘0.82’, 9, ‘sp|P80162|CXCL6_HUMAN C-X-C motif chemokine 6 OS = Homo sapiens GN = CXCL6’, 114, 11897]
[‘0.81’, 23, ‘sp|Q9P031|TAP26_HUMAN Thyroid transcription factor 1-associated protein 26 OS = Homo sapiens GN = CCDC59’, 241, 28669]
[‘0.81’, 11, ‘sp|Q6ZST2|ZCH23_HUMAN Zinc finger CCHC domain-containing protein 23 OS = Homo sapiens GN = ZCCHC23’, 131, 14409]
[‘0.81’, 11, ‘sp|P62316|SMD2_HUMAN Small nuclear ribonucleoprotein Sm D2 OS = Homo sapiens GN = SNRPD2’, 118, 13526]
[‘0.81’, 10, ‘sp|O95182|NDUA7_HUMAN NADH dehydrogenase [ubiquinone] 1 alpha subcomplex subunit 7 OS = Homo sapiens GN = NDUFA7’, 113, 12551]
[‘0.81’, 10, ‘sp|A6NFY7|LYRM8_HUMAN LYR motif-containing protein ENSP00000368165 OS = Homo sapiens’, 115, 12806]
[‘0.81’, 7, ‘sp|Q7Z3B0|CE043_HUMAN UPF0542 protein C5orf43 OS = Homo sapiens GN = C5orf43’, 74, 8625]
[‘0.80’, 72, ‘sp|Q9UII5|ZN107_HUMAN Zinc finger protein 107 OS = Homo sapiens GN = ZNF107’, 783, 90672]
[‘0.80’, 69, ‘sp|Q9Y3M9|ZN337_HUMAN Zinc finger protein 337 OS = Homo sapiens GN = ZNF337’, 751, 86874]
[‘0.80’, 49, ‘sp|Q5SXM1|ZN678_HUMAN Zinc finger protein 678 OS = Homo sapiens GN = ZNF678’, 525, 61411]
[‘0.80’, 47, ‘sp|Q96BV0|ZN775_HUMAN Zinc finger protein 775 OS = Homo sapiens GN = ZNF775’, 537, 59751]
[‘0.80’, 40, ‘sp|P51522|ZNF83_HUMAN Zinc finger protein 83 OS = Homo sapiens GN = ZNF83’, 428, 49778]
[‘0.80’, 19, ‘sp|Q9UGY1|NOL12_HUMAN Nucleolar protein 12 OS = Homo sapiens GN = NOL12’, 213, 24662]
[‘0.80’, 19, ‘sp|O76093|FGF18_HUMAN Fibroblast growth factor 18 OS = Homo sapiens GN = FGF18’, 207, 23988]
[‘0.80’, 16, ‘sp|P20800|EDN2_HUMAN Endothelin-2 OS = Homo sapiens GN = EDN2’, 178, 19959]
[‘0.80’, 8, ‘sp|Q9NRX3|NUA4L_HUMAN NADH dehydrogenase [ubiquinone] 1 alpha subcomplex subunit 4-like 2 OS = Homo sapiens GN = NDUFA4L2’, 87, 9965]
[‘0.80’, 8, ‘sp|Q02221|CX6A2_HUMAN Cytochrome c oxidase polypeptide 6A2, mitochondrial OS = Homo sapiens GN = COX6A2’, 97, 10815]
[‘0.80’, 5, ‘sp|Q9P0U1|TOM7_HUMAN Mitochondrial import receptor subunit TOM7 homolog OS = Homo sapiens GN = TOMM7’, 55, 6248]

Histones
[‘2.70’, 59, ‘sp|P10412|H14_HUMAN Histone H1.4 OS = Homo sapiens GN = HIST1H1E’, 219, 21865]
[‘2.66’, 60, ‘sp|P16401|H15_HUMAN Histone H1.5 OS = Homo sapiens GN = HIST1H1B’, 226, 22580]
[‘2.60’, 58, ‘sp|P16402|H13_HUMAN Histone H1.3 OS = Homo sapiens GN = HIST1H1D’, 221, 22349]
[‘2.57’, 55, ‘sp|P16403|H12_HUMAN Histone H1.2 OS = Homo sapiens GN = HIST1H1C’, 213, 21364]
[‘2.55’, 53, ‘sp|P07305|H10_HUMAN Histone H1.0 OS = Homo sapiens GN = H1F0’, 194, 20862]
[‘2.47’, 54, ‘sp|Q02539|H11_HUMAN Histone H1.1 OS = Homo sapiens GN = HIST1H1A’, 215, 21842]
[‘2.10’, 46, ‘sp|P22492|H1T_HUMAN Histone H1t OS = Homo sapiens GN = HIST1H1T’, 207, 22018]
[‘1.79’, 40, ‘sp|Q92522|H1X_HUMAN Histone H1x OS = Homo sapiens GN = H1FX’, 213, 22487]
[‘1.63’, 42, ‘sp|Q75WM6|H1FNT_HUMAN Testis-specific H1 histone OS = Homo sapiens GN = H1FNT’, 234, 25888]
[‘1.60’, 18, ‘sp|P62805|H4_HUMAN Histone H4 OS = Homo sapiens GN = HIST1H4A’, 103, 11367]
[‘1.56’, 17, ‘sp|Q99525|H4G_HUMAN Histone H4-like protein type G OS = Homo sapiens GN = HIST1H4G’, 98, 11009]
[‘1.39’, 35, ‘sp|P60008|HILS1_HUMAN Spermatid-specific linker histone H1-like protein OS = Homo sapiens GN = HILS1’, 231, 25631]
[‘1.32’, 18, ‘sp|Q93079|H2B1H_HUMAN Histone H2B type 1-H OS = Homo sapiens GN = HIST1H2BH’, 126, 13892]
[‘1.32’, 18, ‘sp|O60814|H2B1K_HUMAN Histone H2B type 1-K OS = Homo sapiens GN = HIST1H2BK’, 126, 13890]
[‘1.31’, 20, ‘sp|Q71DI3|H32_HUMAN Histone H3.2 OS = Homo sapiens GN = HIST2H3A’, 136, 15388]
[‘1.31’, 20, ‘sp|P84243|H33_HUMAN Histone H3.3 OS = Homo sapiens GN = H3F3A’, 136, 15327]
[‘1.31’, 20, ‘sp|P68431|H31_HUMAN Histone H3.1 OS = Homo sapiens GN = HIST1H3A’, 136, 15404]
[‘1.31’, 18, ‘sp|Q99880|H2B1L_HUMAN Histone H2B type 1-L OS = Homo sapiens GN = HIST1H2BL’, 126, 13952]
[‘1.31’, 18, ‘sp|Q99879|H2B1M_HUMAN Histone H2B type 1-M OS = Homo sapiens GN = HIST1H2BM’, 126, 13989]
[‘1.31’, 18, ‘sp|Q99877|H2B1N_HUMAN Histone H2B type 1-N OS = Homo sapiens GN = HIST1H2BN’, 126, 13922]
[‘1.31’, 18, ‘sp|Q8N257|H2B3B_HUMAN Histone H2B type 3-B OS = Homo sapiens GN = HIST3H2BB’, 126, 13908]
[‘1.31’, 18, ‘sp|Q5QNW6|H2B2F_HUMAN Histone H2B type 2-F OS = Homo sapiens GN = HIST2H2BF’, 126, 13920]
[‘1.31’, 18, ‘sp|Q16778|H2B2E_HUMAN Histone H2B type 2-E OS = Homo sapiens GN = HIST2H2BE’, 126, 13920]
[‘1.31’, 18, ‘sp|P58876|H2B1D_HUMAN Histone H2B type 1-D OS = Homo sapiens GN = HIST1H2BD’, 126, 13936]
[‘1.31’, 18, ‘sp|P57053|H2BFS_HUMAN Histone H2B type F-S OS = Homo sapiens GN = H2BFS’, 126, 13944]
[‘1.31’, 18, ‘sp|P33778|H2B1B_HUMAN Histone H2B type 1-B OS = Homo sapiens GN = HIST1H2BB’, 126, 13950]
[‘1.31’, 18, ‘sp|P23527|H2B1O_HUMAN Histone H2B type 1-O OS = Homo sapiens GN = HIST1H2BO’, 126, 13906]
[‘1.31’, 18, ‘sp|P06899|H2B1J_HUMAN Histone H2B type 1-J OS = Homo sapiens GN = HIST1H2BJ’, 126, 13904]
[‘1.30’, 20, ‘sp|Q16695|H31T_HUMAN Histone H3.1t OS = Homo sapiens GN = HIST3H3’, 136, 15508]
[‘1.29’, 18, ‘sp|Q96A08|H2B1A_HUMAN Histone H2B type 1-A OS = Homo sapiens GN = HIST1H2BA’, 127, 14167]
[‘1.28’, 12, ‘sp|P05204|HMGN2_HUMAN Non-histone chromosomal protein HMG-17 OS = Homo sapiens GN = HMGN2’, 90, 9392]
[‘1.24’, 17, ‘sp|Q16777|H2A2C_HUMAN Histone H2A type 2-C OS = Homo sapiens GN = HIST2H2AC’, 129, 13988]
[‘1.23’, 17, ‘sp|Q93077|H2A1C_HUMAN Histone H2A type 1-C OS = Homo sapiens GN = HIST1H2AC’, 130, 14105]
[‘1.23’, 17, ‘sp|Q7L7L0|H2A3_HUMAN Histone H2A type 3 OS = Homo sapiens GN = HIST3H2A’, 130, 14121]
[‘1.23’, 17, ‘sp|Q6FI13|H2A2A_HUMAN Histone H2A type 2-A OS = Homo sapiens GN = HIST2H2AA3’, 130, 14095]
[‘1.23’, 17, ‘sp|P20671|H2A1D_HUMAN Histone H2A type 1-D OS = Homo sapiens GN = HIST1H2AD’, 130, 14107]
[‘1.23’, 17, ‘sp|P0C0S8|H2A1_HUMAN Histone H17/2A type 1 OS = Homo sapiens GN = HIST1H2AG’, 130, 14091]
[‘1.23’, 17, ‘sp|P04908|H2A1B_HUMAN Histone H2A type 1-B/E OS = Homo sapiens GN = HIST1H2AB’, 130, 14135]
[‘1.19’, 18, ‘sp|Q6NXT2|H3L_HUMAN Histone H3-like OS = Homo sapiens’, 135, 15213]
[‘1.18’, 16, ‘sp|Q96KK5|H2A1H_HUMAN Histone H2A type 1-H OS = Homo sapiens GN = HIST1H2AH’, 128, 13906]
[‘1.17’, 16, ‘sp|Q99878|H2A1J_HUMAN Histone H2A type 1-J OS = Homo sapiens GN = HIST1H2AJ’, 128, 13936]
[‘1.16’, 16, ‘sp|Q8IUE6|H2A2B_HUMAN Histone H2A type 2-B OS = Homo sapiens GN = HIST2H2AB’, 130, 13995]
[‘1.09’, 15, ‘sp|Q96QV6|H2A1A_HUMAN Histone H2A type 1-A OS = Homo sapiens GN = HIST1H2AA’, 131, 14233]
[‘1.08’, 16, ‘sp|P16104|H2AX_HUMAN Histone H2A.x OS = Homo sapiens GN = H2AFX’, 143, 15144]
[‘1.08’, 14, ‘sp|Q71UI9|H2AV_HUMAN Histone H2A.V OS = Homo sapiens GN = H2AFV’, 128, 13508]
[‘1.07’, 14, ‘sp|P0C0S5|H2AZ_HUMAN Histone H2A.Z OS = Homo sapiens GN = H2AFZ’, 128, 13552]

Ribosome
[‘2.87’, 19, ‘sp|P62861|RS30_HUMAN 40S ribosomal protein S30 OS = Homo sapiens GN = FAU’, 59, 6647]
[‘2.84’, 18, ‘sp|P62891|RL39_HUMAN 60S ribosomal protein L39 OS = Homo sapiens GN = RPL39’, 51, 6406]
[‘2.57’, 16, ‘sp|Q96EH5|RL39L_HUMAN 60S ribosomal protein L39-like OS = Homo sapiens GN = RPL39L’, 51, 6292]
[‘2.54’, 28, ‘sp|P61927|RL37_HUMAN 60S ribosomal protein L37 OS = Homo sapiens GN = RPL37’, 97, 11077]
[‘2.28’, 40, ‘sp|P47914|RL29_HUMAN 60S ribosomal protein L29 OS = Homo sapiens GN = RPL29’, 159, 17752]
[‘2.17’, 28, ‘sp|P49207|RL34_HUMAN 60S ribosomal protein L34 OS = Homo sapiens GN = RPL34’, 117, 13292]
[‘2.17’, 27, ‘sp|Q969Q0|RL36L_HUMAN 60S ribosomal protein L36a-like OS = Homo sapiens GN = RPL36AL’, 106, 12468]
[‘2.17’, 27, ‘sp|P83881|RL36A_HUMAN 60S ribosomal protein L36a OS = Homo sapiens GN = RPL36A’, 106, 12440]
[‘2.07’, 30, ‘sp|P42766|RL35_HUMAN 60S ribosomal protein L35 OS = Homo sapiens GN = RPL35’, 123, 14551]
[‘2.07’, 25, ‘sp|Q9Y3U8|RL36_HUMAN 60S ribosomal protein L36 OS = Homo sapiens GN = RPL36’, 105, 12253]
[‘1.97’, 35, ‘sp|P83731|RL24_HUMAN 60S ribosomal protein L24 OS = Homo sapiens GN = RPL24’, 157, 17778]
[‘1.92’, 30, ‘sp|P46779|RL28_HUMAN 60S ribosomal protein L28 OS = Homo sapiens GN = RPL28’, 137, 15747]
[‘1.90’, 44, ‘sp|P84098|RL19_HUMAN 60S ribosomal protein L19 OS = Homo sapiens GN = RPL19’, 196, 23465]
[‘1.85’, 19, ‘sp|P61513|RL37A_HUMAN 60S ribosomal protein L37a OS = Homo sapiens GN = RPL37A’, 92, 10275]
[‘1.72’, 37, ‘sp|Q07020|RL18_HUMAN 60S ribosomal protein L18 OS = Homo sapiens GN = RPL18’, 188, 21634]
[‘1.69’, 22, ‘sp|P62854|RS26_HUMAN 40S ribosomal protein S26 OS = Homo sapiens GN = RPS26’, 115, 13015]
[‘1.68’, 39, ‘sp|P50914|RL14_HUMAN 60S ribosomal protein L14 OS = Homo sapiens GN = RPL14’, 213, 23289]
[‘1.66’, 26, ‘sp|P62910|RL32_HUMAN 60S ribosomal protein L32 OS = Homo sapiens GN = RPL32’, 135, 15859]
[‘1.65’, 39, ‘sp|P61313|RL15_HUMAN 60S ribosomal protein L15 OS = Homo sapiens GN = RPL15’, 204, 24146]
[‘1.63’, 26, ‘sp|P46776|RL27A_HUMAN 60S ribosomal protein L27a OS = Homo sapiens GN = RPL27A’, 148, 16561]
[‘1.63’, 19, ‘sp|Q9P0J6|RM36_HUMAN 39S ribosomal protein L36, mitochondrial OS = Homo sapiens GN = MRPL36’, 103, 11784]
[‘1.62’, 39, ‘sp|P26373|RL13_HUMAN 60S ribosomal protein L13 OS = Homo sapiens GN = RPL13’, 211, 24261]
[‘1.61’, 52, ‘sp|Q02878|RL6_HUMAN 60S ribosomal protein L6 OS = Homo sapiens GN = RPL6’, 288, 32727]
[‘1.59’, 25, ‘sp|P61353|RL27_HUMAN 60S ribosomal protein L27 OS = Homo sapiens GN = RPL27’, 136, 15797]
[‘1.55’, 36, ‘sp|P40429|RL13A_HUMAN 60S ribosomal protein L13a OS = Homo sapiens GN = RPL13A’, 203, 23577]
[‘1.55’, 27, ‘sp|P62750|RL23A_HUMAN 60S ribosomal protein L23a OS = Homo sapiens GN = RPL23A’, 156, 17695]
[‘1.54’, 33, ‘sp|Q9NZE8|RM35_HUMAN 39S ribosomal protein L35, mitochondrial OS = Homo sapiens GN = MRPL35’, 188, 21514]
[‘1.53’, 19, ‘sp|P18077|RL35A_HUMAN 60S ribosomal protein L35a OS = Homo sapiens GN = RPL35A’, 110, 12537]
[‘1.50’, 71, ‘sp|P36578|RL4_HUMAN 60S ribosomal protein L4 OS = Homo sapiens GN = RPL4’, 427, 47697]
[‘1.49’, 15, ‘sp|Q9BQ48|RM34_HUMAN 39S ribosomal protein L34, mitochondrial OS = Homo sapiens GN = MRPL34’, 92, 10164]
[‘1.48’, 25, ‘sp|Q9UNX3|RL26L_HUMAN 60S ribosomal protein L26-like 1 OS = Homo sapiens GN = RPL26L1’, 145, 17256]
[‘1.48’, 25, ‘sp|P61254|RL26_HUMAN 60S ribosomal protein L26 OS = Homo sapiens GN = RPL26’, 145, 17258]
[‘1.47’, 42, ‘sp|P62753|RS6_HUMAN 40S ribosomal protein S6 OS = Homo sapiens GN = RPS6’, 249, 28680]
[‘1.46’, 11, ‘sp|P63173|RL38_HUMAN 60S ribosomal protein L38 OS = Homo sapiens GN = RPL38’, 70, 8217]
[‘1.45’, 11, ‘sp|O75394|RM33_HUMAN 39S ribosomal protein L33, mitochondrial OS = Homo sapiens GN = MRPL33’, 65, 7619]
[‘1.41’, 34, ‘sp|P62241|RS8_HUMAN 40S ribosomal protein S8 OS = Homo sapiens GN = RPS8’, 208, 24205]
[‘1.39’, 19, ‘sp|P62851|RS25_HUMAN 40S ribosomal protein S25 OS = Homo sapiens GN = RPS25’, 125, 13742]
[‘1.38’, 41, ‘sp|P62424|RL7A_HUMAN 60S ribosomal protein L7a OS = Homo sapiens GN = RPL7A’, 266, 29995]
[‘1.38’, 40, ‘sp|P18124|RL7_HUMAN 60S ribosomal protein L7 OS = Homo sapiens GN = RPL7’, 248, 29225]
[‘1.38’, 25, ‘sp|P46778|RL21_HUMAN 60S ribosomal protein L21 OS = Homo sapiens GN = RPL21’, 160, 18564]
[‘1.37’, 28, ‘sp|Q02543|RL18A_HUMAN 60S ribosomal protein L18a OS = Homo sapiens GN = RPL18A’, 176, 20762]
[‘1.36’, 9, ‘sp|P62273|RS29_HUMAN 40S ribosomal protein S29 OS = Homo sapiens GN = RPS29’, 56, 6676]
[‘1.35’, 37, ‘sp|P62917|RL8_HUMAN 60S ribosomal protein L8 OS = Homo sapiens GN = RPL8’, 257, 28024]
[‘1.35’, 21, ‘sp|P62266|RS23_HUMAN 40S ribosomal protein S23 OS = Homo sapiens GN = RPS23’, 143, 15807]
[‘1.32’, 39, ‘sp|O95478|NSA2_HUMAN Ribosome biogenesis protein NSA2 homolog OS = Homo sapiens GN = TINP1’, 260, 30065]
[‘1.30’, 20, ‘sp|Q86WX3|S19BP_HUMAN 40S ribosomal protein S19-binding protein 1 OS = Homo sapiens GN = RPS19BP1’, 136, 15433]
[‘1.28’, 22, ‘sp|Q9BYC9|RM20_HUMAN 39S ribosomal protein L20, mitochondrial OS = Homo sapiens GN = MRPL20’, 149, 17442]
[‘1.26’, 23, ‘sp|P62280|RS11_HUMAN 40S ribosomal protein S11 OS = Homo sapiens GN = RPS11’, 158, 18430]
[‘1.21’, 18, ‘sp|Q4U2R6|RM51_HUMAN 39S ribosomal protein L51, mitochondrial OS = Homo sapiens GN = MRPL51’, 128, 15094]
[‘1.19’, 20, ‘sp|P62277|RS13_HUMAN 40S ribosomal protein S13 OS = Homo sapiens GN = RPS13’, 151, 17222]
[‘1.19’, 17, ‘sp|P62899|RL31_HUMAN 60S ribosomal protein L31 OS = Homo sapiens GN = RPL31’, 125, 14462]
[‘1.16’, 20, ‘sp|P62269|RS18_HUMAN 40S ribosomal protein S18 OS = Homo sapiens GN = RPS18’, 152, 17718]
[‘1.14’, 17, ‘sp|P62829|RL23_HUMAN 60S ribosomal protein L23 OS = Homo sapiens GN = RPL23’, 140, 14865]
[‘1.12’, 33, ‘sp|P82914|RT15_HUMAN 28S ribosomal protein S15, mitochondrial OS = Homo sapiens GN = MRPS15’, 257, 29842]
[‘1.10’, 51, ‘sp|Q92901|RL3L_HUMAN 60S ribosomal protein L3-like OS = Homo sapiens GN = RPL3L’, 407, 46295]
[‘1.10’, 18, ‘sp|P62249|RS16_HUMAN 40S ribosomal protein S16 OS = Homo sapiens GN = RPS16’, 146, 16445]
[‘1.09’, 23, ‘sp|P18621|RL17_HUMAN 60S ribosomal protein L17 OS = Homo sapiens GN = RPL17’, 184, 21397]
[‘1.07’, 21, ‘sp|Q9UHA3|RLP24_HUMAN Probable ribosome biogenesis protein RLP24 OS = Homo sapiens GN = C15orf15’, 163, 19621]
[‘1.07’, 16, ‘sp|O60783|RT14_HUMAN 28S ribosomal protein S14, mitochondrial OS = Homo sapiens GN = MRPS14’, 128, 15138]
[‘1.06’, 16, ‘sp|O15235|RT12_HUMAN 28S ribosomal protein S12, mitochondrial OS = Homo sapiens GN = MRPS12’, 138, 15172]
[‘1.05’, 48, ‘sp|P39023|RL3_HUMAN 60S ribosomal protein L3 OS = Homo sapiens GN = RPL3’, 403, 46108]
[‘1.03’, 25, ‘sp|P27635|RL10_HUMAN 60S ribosomal protein L10 OS = Homo sapiens GN = RPL10’, 214, 24603]
[‘1.03’, 16, ‘sp|Q9P0M9|RM27_HUMAN 39S ribosomal protein L27, mitochondrial OS = Homo sapiens GN = MRPL27’, 148, 16072]
[‘1.03’, 11, ‘sp|P82921|RT21_HUMAN 28S ribosomal protein S21, mitochondrial OS = Homo sapiens GN = MRPS21’, 87, 10741]
[‘1.02’, 12, ‘sp|Q9BQC6|RT63_HUMAN Ribosomal protein 63, mitochondrial OS = Homo sapiens GN = MRP63’, 102, 12266]
[‘1.00’, 28, ‘sp|Q6DKI1|RL7L_HUMAN 60S ribosomal protein L7-like 1 OS = Homo sapiens GN = RPL7L1’, 246, 28660]
[‘0.99’, 22, ‘sp|P46781|RS9_HUMAN 40S ribosomal protein S9 OS = Homo sapiens GN = RPS9’, 194, 22591]
[‘0.98’, 53, ‘sp|O76021|RL1D1_HUMAN Ribosomal L1 domain-containing protein 1 OS = Homo sapiens GN = RSL1D1’, 490, 54972]
[‘0.97’, 32, ‘sp|Q5T653|RM02_HUMAN 39S ribosomal protein L2, mitochondrial OS = Homo sapiens GN = 
MRPL2’, 305, 33300] [‘0.96’, 23, ‘sp|Q96L21|RL10L_HUMAN 60Sribosomal protein L10-like OS = Homo sapiens GN = RPL10L’, 214, 24518][‘0.96’, 21, ‘sp|Q9NVS2|RT18A_HUMAN 28S ribosomal protein S18a,mitochondrial OS = Homo sapiens GN = MRPS18A’, 196, 22183] [‘0.96’, 9,‘sp|Q71UM5|RS27L_HUMAN 40S ribosomal protein S27-like protein OS = Homosapiens GN = RPS27L’, 84, 9477] [‘0.96’, 9, ‘sp|P42677|RS27_HUMAN 40Sribosomal protein S27 OS = Homo sapiens GN = RPS27’, 84, 9461] [‘0.93’,38, ‘sp|Q15050|RRS1_HUMAN Ribosome biogenesis regulatory protein homologOS = Homo sapiens GN = RRS1’, 365, 41193] [‘0.90’, 14,‘sp|Q6P1L8|RM14_HUMAN 39S ribosomal protein L14, mitochondrial OS = Homosapiens GN = MRPL14’, 145, 15947] [‘0.90’, 14, ‘sp|P39019|RS19_HUMAN 40Sribosomal protein S19 OS = Homo sapiens GN = RPS19’, 145, 16060][‘0.87’, 25, ‘sp|Q9HD33|RM47_HUMAN 39S ribosomal protein L47,mitochondrial OS = Homo sapiens GN = MRPL47’, 252, 29577] [‘0.86’, 21,‘sp|P62906|RL10A_HUMAN 60S ribosomal protein L10a OS = Homo sapiens GN =RPL10A’, 217, 24831] [‘0.84’, 26, ‘sp|P15880|RS2_HUMAN 40S ribosomalprotein S2 OS = Homo sapiens GN = RPS2’, 293, 31324] [‘0.83’, 13,‘sp|Q9Y3D5|RT18C_HUMAN 28S ribosomal protein S18c, mitochondrial OS =Homo sapiens GN = MRPS18C’, 142, 15849] RS Domain [‘1.74’, 44,‘sp|Q01130|SFRS2_HUMAN Splicing factor, arginine/serine-rich 2 OS = Homosapiens GN = SFRS2’, 221, 25476] [‘1.66’, 93, ‘sp|Q08170|SFRS4_HUMANSplicing factor, arginine/serine-rich 4 OS = Homo sapiens GN = SFRS4’,494, 56678] [‘1.35’, 26, ‘sp|P84103|SFRS3_HUMAN Splicing factor,arginine/serine-rich 3 OS = Homo sapiens GN = SFRS3’, 164, 19329][‘0.91’, 48, ‘sp|Q05519|SFR11_HUMAN Splicing factor arginine/serine-rich11 OS = Homo sapiens GN = SFRS11’, 484, 53542] Isoforms [‘2.10’, 36,‘sp|Q8N2M8-2|SFR16_HUMAN Isoform 2 of Splicing factor,arginine/serine-rich 16 OS = Homo sapiens GN = SFRS16’, 159, 17218][‘1.96’, 41, ‘sp|Q8IZA3-2|H1FOO_HUMAN Isoform 2 of Histone H1oo OS =Homo sapiens GN = H1FOO’, 207, 21010] [‘1.93’, 
51,‘sp|Q9BUV0-3|CA063_HUMAN Isoform 3 of UPF0471 protein C1orf63 OS = Homosapiens GN = C1orf63’, 226, 26604] [‘1.93’, 10, ‘sp|Q9Y5P2-3|CSAG2_HUMANIsoform 3 of Chondrosarcoma-associated gene 2/3A protein OS = Homosapiens GN = CSAG2’, 48, 5216] [‘1.87’, 28, ‘sp|Q8NAV1-2|PR38A_HUMANIsoform 2 of Pre-mRNA-splicing factor 38A OS = Homo sapiens GN =PRPF38A’, 125, 15462] [‘1.83’, 10, ‘sp|Q32NB8-4|PGPS1_HUMAN Isoform 4 ofCDP-diacylglycerol--glycerol-3-phosphate 3-phosphatidyltransferase,mitochondrial OS = Homo sapiens GN = PGS1’, 50, 5463] [‘1.77’, 50,‘sp|Q9BUV0-2|CA063_HUMAN Isoform 2 of UPF0471 protein C1orf63 OS = Homosapiens GN = C1orf63’, 242, 28363] [‘1.74’, 30, ‘sp|P49760-2|CLK2_HUMANIsoform Short of Dual specificity protein kinase CLK2 OS = Homo sapiensGN = CLK2’, 139, 17569] [‘1.68’, 46, ‘sp|Q16629-1|SFRS7_HUMAN Isoform 1of Splicing factor, arginine/serine-rich 7 OS = Homo sapiens GN =SFRS7’, 238, 27366] [‘1.68’, 25, ‘sp|P62847-2|RS24_HUMAN Isoform 2 of40S ribosomal protein S24 OS = Homo sapiens GN = RPS24’, 130, 15068][‘1.66’, 59, ‘sp|Q8IZA3-1|H1FOO_HUMAN Isoform 1 of Histone H1oo OS =Homo sapiens GN = H1FOO’, 346, 35813] [‘1.66’, 53,‘sp|Q9BRL6-1|SFR2B_HUMAN Isoform 1 of Splicing factor,arginine/serine-rich 2B OS = Homo sapiens GN = SFRS2B’, 282, 32287][‘1.65’, 25, ‘sp|P62847-1|RS24_HUMAN Isoform 1 of 40S ribosomal proteinS24 OS = Homo sapiens GN = RPS24’, 133, 15423] [‘1.61’, 54,‘sp|Q9BUV0-1|CA063_HUMAN Isoform 1 of UPF0471 protein C1orf63 OS = Homosapiens GN = C1orf63’, 290, 33613] [‘1.61’, 50, ‘sp|Q9BRL6-2|SFR2B_HUMANIsoform 2 of Splicing factor, arginine/serine-rich 2B OS = Homo sapiensGN = SFRS2B’, 275, 31424] [‘1.61’, 6, ‘sp|Q92876-3|KLK6_HUMAN Isoform 3of Kallikrein-6 OS = Homo sapiens GN = KLK6’, 40, 4333] [‘1.60’, 54,‘sp|Q15287-1|RNPS1_HUMAN Isoform 1 of RNA-binding protein withserine-rich domain 1 OS = Homo sapiens GN = RNPS1’, 305, 34208] [‘1.58’,32, ‘sp|Q13875-2|MOBP_HUMAN Isoform 2 of Myelin-associatedoligodendrocyte basic protein OS = Homo sapiens GN 
= MOBP’, 182, 20772][‘1.57’, 49, ‘sp|Q15287-2|RNPS1_HUMAN Isoform 2 of RNA-binding proteinwith serine-rich domain 1 OS = Homo sapiens GN = RNPS1’, 282, 31709][‘1.57’, 32, ‘sp|Q13875-1|MOBP_HUMAN Isoform 1 of Myelin-associatedoligodendrocyte basic protein OS = Homo sapiens GN = MOBP’, 183, 20959][‘1.56’, 50, ‘sp|Q66PJ3-5|AR6P4_HUMAN Isoform 5 of ADP-ribosylationfactor-like protein 6-interacting protein 4 OS = Homo sapiens GN =ARL6IP4’, 304, 32178] [‘1.55’, 44, ‘sp|Q9HB58-4|SP110_HUMAN Isoform 4 ofSp110 nuclear body protein OS = Homo sapiens GN = SP110’, 248, 28609][‘1.54’, 33, ‘sp|Q66PJ3-6|AR6P4_HUMAN Isoform 6 of ADP-ribosylationfactor-like protein 6-interacting protein 4 OS = Homo sapiens GN =ARL6IP4’, 215, 22007] [‘1.51’, 28, ‘sp|P49761-2|CLK3_HUMAN Isoform 2 ofDual specificity protein kinase CLK3 OS = Homo sapiens GN = CLK3’, 152,18971] [‘1.44’, 18, ‘sp|Q14CB8-4|RHG19_HUMAN Isoform 4 of RhoGTPase-activating protein 19 OS = Homo sapiens GN = ARHGAP19’, 112,12547] [‘1.44’, 13, ‘sp|Q13875-3|MOBP_HUMAN Isoform 3 ofMyelin-associated oligodendrocyte basic protein OS = Homo sapiens GN =MOBP’, 81, 9614] [‘1.43’, 44, ‘sp|O75494-2|FUSIP_HUMAN Isoform 2 ofFUS-interacting serine-arginine-rich protein 1 OS = Homo sapiens GN =FUSIP1’, 261, 31213] [‘1.43’, 12, ‘sp|Q15651-2|HMGN3_HUMAN Isoform 2 ofHigh mobility group nucleosome-binding domain-containing protein 3 OS =Homo sapiens GN = HMGN3’, 77, 8377] [‘1.42’, 56,‘sp|Q13247-1|SFRS6_HUMAN Isoform SRP55-1 of Splicing factor,arginine/serine-rich 6 OS = Homo sapiens GN = SFRS6’, 344, 39586][‘1.42’, 44, ‘sp|O75494-1|FUSIP_HUMAN Isoform 1 of FUS-interactingserine-arginine-rich protein 1 OS = Homo sapiens GN = FUSIP1’, 262,31300] [‘1.42’, 8, ‘sp|Q70YC5-5|ZN365_HUMAN Isoform 6 of Protein ZNF365OS = Homo sapiens GN = ZNF365’, 51, 5653] [‘1.41’, 48,‘sp|Q9UK58-3|CCNL1_HUMAN Isoform 3 of Cyclin-L1 OS = Homo sapiens GN =CCNL1’, 299, 34688] [‘1.41’, 9, ‘sp|Q2NKX9-2|CB068_HUMAN Isoform 2 ofUPF0561 protein C2orf68 OS = Homo sapiens GN = 
C2orf68’, 58, 6747][‘1.39’, 25, ‘sp|Q66K41-2|Z385C_HUMAN Isoform 2 of Zinc finger protein385C OS = Homo sapiens GN = ZNF385C’, 174, 18242] [‘1.38’, 10,‘sp|Q9UQ07-3|MOK_HUMAN Isoform 3 of MAPK/MAK/MRK overlapping kinase OS =Homo sapiens GN = RAGE’, 73, 7879] [‘1.37’, 42, ‘sp|Q13243-3|SFRS5_HUMANIsoform SRP40-4 of Splicing factor, arginine/serine-rich 5 OS = Homosapiens GN = SFRS5’, 269, 30858] [‘1.36’, 23, ‘sp|Q6PGN9-4|PSRC1_HUMANIsoform D of Proline/serine-rich coiled-coil protein 1 OS = Homo sapiensGN = PSRC1’, 163, 16980] [‘1.36’, 15, ‘sp|Q6P1Q0-6|LTMD1_HUMAN Isoform 6of LETM1 domain-containing protein 1 OS = Homo sapiens GN = LETMD1’, 99,11221] [‘1.36’, 10, ‘sp|O75920-2|SERF1_HUMAN Isoform Short of SmallEDRK-rich factor 1 OS = Homo sapiens GN = SERF1A’, 62, 7336] [‘1.35’,68, ‘sp|Q7L4I2-1|RSRC2_HUMAN Isoform 1 of Arginine/serine-richcoiled-coil protein 2 OS = Homo sapiens GN = RSRC2’, 434, 50559][‘1.35’, 31, ‘sp|Q96HZ4-2|HES6_HUMAN Isoform 2 of Transcription cofactorHES-6 OS = Homo sapiens GN = HES6’, 214, 23483] [‘1.35’, 24,‘sp|Q8N726-1|CD2A2_HUMAN Isoform 4 of Cyclin-dependent kinase inhibitor2A, isoform 4 OS = Homo sapiens GN = CDKN2A’, 173, 18005] [‘1.35’, 11,‘sp|Q5JUX0-2|SPIN3_HUMAN Isoform 2 of Spindlin-3 OS = Homo sapiens GN =SPIN3’, 77, 8415] [‘1.34’, 17, ‘sp|P49450-2|CENPA_HUMAN Isoform 2 ofHistone H3-like centromeric protein A OS = Homo sapiens GN = CENPA’,114, 13001] [‘1.31’, 58, ‘sp|Q7L4I2-2|RSRC2_HUMAN Isoform 2 ofArginine/serine-rich coiled-coil protein 2 OS = Homo sapiens GN =RSRC2’, 386, 44878] [‘1.29’, 40, ‘sp|Q13243-1|SFRS5_HUMAN IsoformSRP40-1 of Splicing factor, arginine/serine-rich 5 OS = Homo sapiens GN= SFRS5’, 272, 31263] [‘1.28’, 47, ‘sp|Q9UK58-2|CCNL1_HUMAN Isoform 2 ofCyclin-L1 OS = Homo sapiens GN = CCNL1’, 320, 37273] [‘1.28’, 15,‘sp|Q66K41-3|Z385C_HUMAN Isoform 3 of Zinc finger protein 385C OS = Homosapiens GN = ZNF385C’, 114, 11856] [‘1.25’, 35, ‘sp|Q5BKY9-1|F133B_HUMANIsoform 1 of Protein FAM133B OS = Homo sapiens GN = FAM133B’, 
247,28385] [‘1.25’, 9, ‘sp|Q86SI9-3|CEI_HUMAN Isoform 3 of Protein CEI OS =Homo sapiens GN = C5orf38’, 70, 7333] [‘1.24’, 47,‘sp|Q96IZ7-1|RSRC1_HUMAN Isoform 1 of Arginine/serine-rich coiled-coilprotein 1 OS = Homo sapiens GN = RSRC1’, 334, 38677] [‘1.24’, 41,‘sp|P62995-1|TRA2B_HUMAN Isoform 1 of Splicing factor,arginine/serine-rich 10 OS = Homo sapiens GN = SFRS10’, 288, 33665][‘1.24’, 30, ‘sp|Q86SI9-2|CEI_HUMAN Isoform 2 of Protein CEI OS = Homosapiens GN = C5orf38’, 226, 24375] [‘1.24’, 17, ‘sp|Q9HC23-1|PROK2_HUMANIsoform 1 of Prokineticin-2 OS = Homo sapiens GN = PROK2’, 129, 14314][‘1.23’, 41, ‘sp|Q96S94-3|CCNL2_HUMAN Isoform 3 of Cyclin-L2 OS = Homosapiens GN = CCNL2’, 298, 33839] [‘1.23’, 33, ‘sp|Q5BKY9-2|F133B_HUMANIsoform 2 of Protein FAM133B OS = Homo sapiens GN = FAM133B’, 237,27193] [‘1.23’, 17, ‘sp|Q9BTM1-1|H2AJ_HUMAN Isoform 1 of Histone H2A.JOS = Homo sapiens GN = H2AFJ’, 129, 14019] [‘1.22’, 44,‘sp|Q66PJ3-4|AR6P4_HUMAN Isoform 4 of ADP-ribosylation factor-likeprotein 6-interacting protein 4 OS = Homo sapiens GN = ARL6IP4’, 338,36210] [‘1.22’, 11, ‘sp|Q8TEW8-4|PAR3L_HUMAN Isoform 4 ofPartitioning-defective 3 homolog B OS = Homo sapiens GN = PARD3B’, 79,9007] [‘1.21’, 46, ‘sp|Q13247-3|SFRS6_HUMAN Isoform SRP55-3 of Splicingfactor, arginine/serine-rich 6 OS = Homo sapiens GN = SFRS6’, 335,38418] [‘1.21’, 44, ‘sp|Q66PJ3-3|AR6P4_HUMAN Isoform 3 ofADP-ribosylation factor-like protein 6-interacting protein 4 OS = Homosapiens GN = ARL6IP4’, 341, 36612] [‘1.20’, 45, ‘sp|Q66PJ3-2|AR6P4_HUMANIsoform 2 of ADP-ribosylation factor-like protein 6-interacting protein4 OS = Homo sapiens GN = ARL6IP4’, 352, 37638] [‘1.20’, 12,‘sp|Q8N6C7-2|PGSF1_HUMAN Isoform 2 of Pituitary gland-specific factor 1OS = Homo sapiens GN = PGSF1’, 91, 10048] [‘1.19’, 38,‘sp|Q13595-1|TRA2A_HUMAN Isoform Long of Transformer-2 protein homologOS = Homo sapiens GN = TRA2A’, 282, 32688] [‘1.17’, 45,‘sp|Q66PJ3-1|AR6P4_HUMAN Isoform 1 of ADP-ribosylation factor-likeprotein 6-interacting protein 4 
OS = Homo sapiens GN = ARL6IP4’, 360,38395] [‘1.17’, 12, ‘sp|O75365-3|TP4A3_HUMAN Isoform 3 of Proteintyrosine phosphatase type IVA 3 OS = Homo sapiens GN = PTP4A3’, 87,10494] [‘1.16’, 24, ‘sp|P02686-3|MBP_HUMAN Isoform 3 of Myelin basicprotein OS = Homo sapiens GN = MBP’, 197, 21493] [‘1.15’, 22,‘sp|P17096-3|HMGA1_HUMAN Isoform HMG-R of High mobility group proteinHMG-I/HMG-Y OS = Homo sapiens GN = HMGA1’, 179, 19694] [‘1.15’, 7,‘sp|Q8IU53-2|CASC2_HUMAN Isoform 2 of Protein CASC2, isoforms 1/2 OS =Homo sapiens GN = CASC2’, 55, 6154] [‘1.14’, 13,‘sp|P31260-2|HXA10_HUMAN Isoform 2 of Homeobox protein Hox-A10 OS = Homosapiens GN = HOXA10’, 94, 11452] [‘1.14’, 12, ‘sp|Q9NZQ0-2|RABJ_HUMANIsoform 2 of Rab and DnaJ domain-containing protein OS = Homo sapiens GN= RBJ’, 90, 10621] [‘1.14’, 10, ‘sp|Q8IVJ8-2|APRG1_HUMAN Isoform 2 ofAP20 region protein 1 OS = Homo sapiens GN = APRG1’, 78, 8910] [‘1.14’,9, ‘sp|Q6QHF9-10|PAOX_HUMAN Isoform 12 of PeroxisomalN(1)-acetyl-spermine/spermidine oxidase OS = Homo sapiens GN = PAOX’,83, 8694] [‘1.14’, 9, ‘sp|P02686-7|MBP_HUMAN Isoform 7 of Myelin basicprotein OS = Homo sapiens GN = MBP’, 74, 8265] [‘1.13’, 38,‘sp|Q9UQ35-3|SRRM2_HUMAN Isoform 3 of Serine/arginine repetitive matrixprotein 2 OS = Homo sapiens GN = SRRM2’, 311, 34212] [‘1.13’, 22,‘sp|P02686-4|MBP_HUMAN Isoform 4 of Myelin basic protein OS = Homosapiens GN = MBP’, 186, 20245] [‘1.13’, 20, ‘sp|P02686-5|MBP_HUMANIsoform 5 of Myelin basic protein OS = Homo sapiens GN = MBP’, 171,18590] [‘1.13’, 12, ‘sp|P17096-2|HMGA1_HUMAN Isoform HMG-Y of Highmobility group protein HMG-I/HMG-Y OS = Homo sapiens GN = HMGA1’, 96,10678] [‘1.12’, 24, ‘sp|Q5HYI7-3|MTX3_HUMAN Isoform 3 of Metaxin-3 OS =Homo sapiens GN = MTX3’, 201, 22355] [‘1.11’, 31,‘sp|Q9GZR2-2|REXO4_HUMAN Isoform 2 of RNA exonuclease 4 OS = Homosapiens GN = REXO4’, 250, 28390] [‘1.11’, 8, ‘sp|Q6H9L7-4|TAIL1_HUMANIsoform 4 of Thrombospondin and AMOP domain-containing isthmin-likeprotein 1 OS = Homo sapiens GN = THSD3’, 76, 7995] 
[‘1.10’, 20,‘sp|Q15170-1|TCAL1_HUMAN Isoform 1 of Transcription elongation factor Aprotein-like 1 OS = Homo sapiens GN = TCEAL1’, 157, 18354] [‘1.10’, 11,‘sp|Q6ZUS6-3|CC149_HUMAN Isoform 3 of Coiled-coil domain-containingprotein 149 OS = Homo sapiens GN = CCDC149’, 86, 10164] [‘1.10’, 7,‘sp|Q70UQ0-3|IKIP_HUMAN Isoform 3 of Inhibitor of nuclear factor kappa-Bkinase-interacting protein OS = Homo sapiens GN = IKIP’, 70, 7141][‘1.09’, 18, ‘sp|P02686-6|MBP_HUMAN Isoform 6 of Myelin basic protein OS= Homo sapiens GN = MBP’, 160, 17343] [‘1.09’, 17,‘sp|P49450-1|CENPA_HUMAN Isoform 1 of Histone H3-like centromericprotein A OS = Homo sapiens GN = CENPA’, 140, 15990] [‘1.08’, 13,‘sp|Q8WWL7-3|CCNB3_HUMAN Isoform 3 of G2/mitotic-specific cyclin-B3 OS =Homo sapiens GN = CCNB3’, 111, 12195] [‘1.07’, 15,‘sp|Q2NKX9-3|CB068_HUMAN Isoform 3 of UPF0561 protein C2orf68 OS = Homosapiens GN = C2orf68’, 127, 14480] [‘1.07’, 10, ‘sp|Q8IUX4-2|ABC3F_HUMANIsoform 2 of DNA dC->dU-editing enzyme APOBEC-3F OS = Homo sapiens GN =APOBEC3F’, 79, 9444] [‘1.06’, 9, ‘sp|Q8IU53-1|CASC2_HUMAN Isoform 1 ofProtein CASC2, isoforms 1/2 OS = Homo sapiens GN = CASC2’, 76, 8607][‘1.06’, 8, ‘sp|Q9UBR5-3|CKLF_HUMAN Isoform CKLF3 of Chemokine-likefactor OS = Homo sapiens GN = CKLF’, 67, 7652] [‘1.05’, 20,‘sp|Q2I0M5-2|RSPO4_HUMAN Isoform 2 of R-spondin-4 OS = Homo sapiens GN =RSPO4’, 172, 19606] [‘1.05’, 8, ‘sp|Q9NPS7-2|F41CL_HUMAN Isoform 2 ofProtein FAM41C-like OS = Homo sapiens’, 63, 7681] [‘1.05’, 6,‘sp|O75460-2|ERN1_HUMAN Isoform 2 of Serine/threonine-proteinkinase/endoribonuclease IRE1 OS = Homo sapiens GN = ERN1’, 70, 6648][‘1.04’, 46, ‘sp|Q5SSJ5-3|HP1B3_HUMAN Isoform 3 of Heterochromatinprotein 1-binding protein 3 OS = Homo sapiens GN = HP1BP3’, 401, 44434][‘1.04’, 18, ‘sp|Q15973-2|ZN124_HUMAN Isoform 4 of Zinc finger protein124 OS = Homo sapiens GN = ZNF124’, 156, 17830] [‘1.04’, 8,‘sp|Q9NPS7-1|F41CL_HUMAN Isoform 1 of Protein FAM41C-like OS = Homosapiens’, 64, 7809] [‘1.03’, 90, ‘sp|Q13427-1|PPIG_HUMAN 
Isoform 1 ofPeptidyl-prolyl cis-trans isomerase G OS = Homo sapiens GN = PPIG’, 754,88618] [‘1.03’, 29, ‘sp|Q9BRU9-1|UTP23_HUMAN Isoform 1 ofrRNA-processing protein UTP23 homolog OS = Homo sapiens GN = UTP23’,249, 28430] [‘1.03’, 18, ‘sp|Q6PH81-1|CP087_HUMAN Isoform 1 of UPF0547protein C16orf87 OS = Homo sapiens GN = C16orf87’, 154, 17799] [‘1.03’,17, ‘sp|Q7Z6I8-2|CE024_HUMAN Isoform 2 of UPF0461 protein C5orf24 OS =Homo sapiens GN = C5orf24’, 155, 16724] [‘1.03’, 17,‘sp|P49759-2|CLK1_HUMAN Isoform Short of Dual specificity protein kinaseCLK1 OS = Homo sapiens GN = CLK1’, 136, 16570] [‘1.03’, 13,‘sp|Q8NG50-4|RDM1_HUMAN Isoform 4 of RAD52 motif-containing protein 1 OS= Homo sapiens GN = RDM1’, 116, 13173] [‘1.03’, 12,‘sp|P17096-1|HMGA1_HUMAN Isoform HMG-I of High mobility group proteinHMG-I/HMG-Y OS = Homo sapiens GN = HMGA1’, 107, 11676] [‘1.03’, 10,‘sp|P48061-1|SDF1_HUMAN Isoform Beta of Stromal cell-derived factor 1 OS= Homo sapiens GN = CXCL12’, 93, 10665] [‘1.02’, 17,‘sp|P82912-3|RT11_HUMAN Isoform 3 of 28S ribosomal protein S11,mitochondrial OS = Homo sapiens GN = MRPS11’, 161, 16903] [‘1.02’, 15,‘sp|Q8N1T3-2|MYO1H_HUMAN Isoform 2 of Myosin-Ih OS = Homo sapiens GN =MYO1H’, 127, 14805] [‘1.02’, 10, ‘sp|Q9NZ81-2|PRR13_HUMAN Isoform 2 ofProline-rich protein 13 OS = Homo sapiens GN = PRR13’, 98, 10531][‘1.02’, 7, ‘sp|Q9Y2A0-3|TPAP1_HUMAN Isoform 3 of p53-activated protein1 OS = Homo sapiens GN = TP53AP1’, 60, 6937] [‘1.01’, 32,‘sp|Q9UBB5-3|MBD2_HUMAN Isoform 3 of Methyl-CpG-binding domain protein 2OS = Homo sapiens GN = MBD2’, 302, 31744] [‘1.01’, 19,‘sp|Q9NWS8-4|RMND1_HUMAN Isoform 4 of Required for meiotic nucleardivision protein 1 homolog OS = Homo sapiens GN = RMND1’, 170, 19360][‘1.01’, 17, ‘sp|Q9H2U2-5|IPYR2_HUMAN Isoform 5 of Inorganicpyrophosphatase 2, mitochondrial OS = Homo sapiens GN = PPA2’, 157,16961] [‘1.01’, 13, ‘sp|P08949-1|NMB_HUMAN Isoform 1 of Neuromedin-B OS= Homo sapiens GN = NMB’, 121, 13255] [‘1.00’, 37,‘sp|Q09FC8-3|ZN415_HUMAN Isoform 3 of 
Zinc finger protein 415 OS = Homosapiens GN = ZNF415’, 325, 37237] [‘1.00’, 35, ‘sp|Q6ZN11-2|ZN793_HUMANIsoform 2 of Zinc finger protein 793 OS = Homo sapiens GN = ZNF793’,312, 35909] [‘1.00’, 31, ‘sp|Q96IZ7-2|RSRC1_HUMAN Isoform 2 ofArginine/serine-rich coiled-coil protein 1 OS = Homo sapiens GN =RSRC1’, 276, 31528] [‘1.00’, 8, ‘sp|Q7Z4H3-3|HDDC2_HUMAN Isoform 3 of HDdomain-containing protein 2 OS = Homo sapiens GN = HDDC2’, 71, 8163][‘0.99’, 10, ‘sp|P56134-2|ATPK_HUMAN Isoform 2 of ATP synthase subunitf, mitochondrial OS = Homo sapiens GN = ATP5J2’, 88, 10363] [‘0.98’, 50,‘sp|Q3SXZ3-2|ZN718_HUMAN Isoform 2 of Zinc finger protein 718 OS = Homosapiens GN = ZNF718’, 446, 51561] [‘0.98’, 35, ‘sp|Q8IXZ2-2|ZC3H3_HUMANIsoform 2 of Zinc finger CCCH domain-containing protein 3 OS = Homosapiens GN = ZC3H3’, 335, 35929] [‘0.98’, 24, ‘sp|Q9NP64-2|NO40_HUMANIsoform 2 of Nucleolar protein of 40 kDa OS = Homo sapiens GN =ZCCHC17’, 217, 24918] [‘0.97’, 48, ‘sp|Q499Z4-1|ZN672_HUMAN Isoform 1 ofZinc finger protein 672 OS = Homo sapiens GN = ZNF672’, 452, 50224][‘0.97’, 11, ‘sp|P10747-2|CD28_HUMAN Isoform 2 of T-cell-specificsurface glycoprotein CD28 OS = Homo sapiens GN = CD28’, 101, 11527][‘0.97’, 9, ‘sp|Q9HC16-3|ABC3G_HUMAN Isoform 3 of DNA dC->dU-editingenzyme APOBEC-3G OS = Homo sapiens GN = APOBEC3G’, 79, 9385] [‘0.97’, 5,‘sp|Q16517-2|NNAT_HUMAN Isoform Beta of Neuronatin OS = Homo sapiens GN= NNAT’, 54, 6153] [‘0.97’, 4, ‘sp|Q96T75-4|DSCR8_HUMAN Isoform 4 ofDown syndrome critical region protein 8 OS = Homo sapiens GN = DSCR8’,37, 4295] [‘0.96’, 61, ‘sp|Q5VTL8-1|PR38B_HUMAN Isoform 1 ofPre-mRNA-splicing factor 38B OS = Homo sapiens GN = PRPF38B’, 546,64467] [‘0.96’, 14, ‘sp|Q8TCC3-3|RM30_HUMAN Isoform 3 of 39S ribosomalprotein L30, mitochondrial OS = Homo sapiens GN = MRPL30’, 131, 15190][‘0.95’, 21, ‘sp|Q9NY12-1|NOLA1_HUMAN Isoform 1 of H/ACAribonucleoprotein complex subunit 1 OS = Homo sapiens GN = NOLA1’, 217,22347] [‘0.95’, 14, ‘sp|Q7Z7F7-1|RM55_HUMAN Isoform 1 of 39S 
ribosomalprotein L55, mitochondrial OS = Homo sapiens GN = MRPL55’, 128, 15128][‘0.95’, 14, ‘sp|Q7Z422-4|CA144_HUMAN Isoform 4 of UPF0485 proteinC1orf144 OS = Homo sapiens GN = C1orf144’, 133, 14760] [‘0.95’, 11,‘sp|Q2T9K0-3|TMM44_HUMAN Isoform 3 of Transmembrane protein 44 OS = Homosapiens GN = TMEM44’, 113, 12491] [‘0.94’, 70, ‘sp|Q8NDQ6-4|ZN540_HUMANIsoform 4 of Zinc finger protein 540 OS = Homo sapiens GN = ZNF540’,637, 74992] [‘0.94’, 56, ‘sp|Q8WXA9-1|SFR12_HUMAN Isoform 1 of Splicingfactor, arginine/serine-rich 12 OS = Homo sapiens GN = SFRS12’, 508,59380] [‘0.94’, 43, ‘sp|Q3MIS6-2|ZN528_HUMAN Isoform 2 of Zinc fingerprotein 528 OS = Homo sapiens GN = ZNF528’, 395, 45715] [‘0.94’, 22,‘sp|O60258-2|FGF17_HUMAN Isoform 2 of Fibroblast growth factor 17 OS =Homo sapiens GN = FGF17’, 205, 23669] [‘0.94’, 10,‘sp|Q9BU19-4|ZN692_HUMAN Isoform 4 of Zinc finger protein 692 OS = Homosapiens GN = ZNF692’, 96, 10818] [‘0.93’, 27, ‘sp|Q6P1L5-2|AL2SC_HUMANIsoform 2 of Amyotrophic lateral sclerosis 2 chromosomal regioncandidate gene 13 protein OS = Homo sapiens GN = ALS2CR13’, 289, 29427][‘0.93’, 27, ‘sp|P12034-1|FGF5_HUMAN Isoform Long of Fibroblast growthfactor 5 OS = Homo sapiens GN = FGF5’, 268, 29550] [‘0.92’, 89,‘sp|Q8N4W9-2|ZN808_HUMAN Isoform 2 of Zinc finger protein 808 OS = Homosapiens GN = ZNF808’, 834, 96803] [‘0.92’, 20, ‘sp|Q5T4W7-1|ARTN_HUMANIsoform 1 of Artemin OS = Homo sapiens GN = ARTN’, 220, 22878] [‘0.92’,15, ‘sp|O15444-1|CCL25_HUMAN Isoform 1 of C-C motif chemokine 25 OS =Homo sapiens GN = CCL25’, 150, 16609] [‘0.92’, 12,‘sp|Q8IVJ8-3|APRG1_HUMAN Isoform 3 of AP20 region protein 1 OS = Homosapiens GN = APRG1’, 119, 13172] [‘0.91’, 67, ‘sp|Q8NDQ6-2|ZN540_HUMANIsoform 2 of Zinc finger protein 540 OS = Homo sapiens GN = ZNF540’,628, 73708] [‘0.91’, 19, ‘sp|P05019-1|IGF1B_HUMAN Isoform IGF-IB ofInsulin-like growth factor IB OS = Homo sapiens GN = IGF1’, 195, 21841][‘0.91’, 14, ‘sp|O60565-2|GREM1_HUMAN Isoform 2 of Gremlin-1 OS = Homosapiens GN = GREM1’, 143, 
16292] [‘0.91’, 12, ‘sp|Q96A00-2|PP14A_HUMANIsoform 2 of Protein phosphatase 1 regulatory subunit 14A OS = Homosapiens GN = PPP1R14A’, 120, 13479] [‘0.91’, 8, ‘sp|P08118-2|MSMB_HUMANIsoform PSP57 of Beta-microseminoprotein OS = Homo sapiens GN = MSMB’,77, 8778] [‘0.90’, 53, ‘sp|Q9UK58-1|CCNL1_HUMAN Isoform 1 of Cyclin-L1OS = Homo sapiens GN = CCNL1’, 526, 59633] [‘0.90’, 40,‘sp|Q03924-1|ZN117_HUMAN Isoform 1 of Zinc finger protein 117 OS = Homosapiens GN = ZNF117’, 383, 45066] [‘0.90’, 27, ‘sp|Q9BXY4-1|RSPO3_HUMANIsoform 1 of R-spondin-3 OS = Homo sapiens GN = RSPO3’, 272, 30928][‘0.90’, 16, ‘sp|Q86SG4-3|DPCA2_HUMAN Isoform 3 of Dresden prostatecarcinoma protein 2 OS = Homo sapiens GN = C15orf21’, 150, 17975][‘0.90’, 13, ‘sp|P47902-2|CDX1_HUMAN Isoform 2 of Homeobox protein CDX-1OS = Homo sapiens GN = CDX1’, 130, 14660] [‘0.89’, 44,‘sp|Q9NXE8-1|CCD49_HUMAN Isoform 1 of Coiled-coil domain-containingprotein 49 OS = Homo sapiens GN = CCDC49’, 425, 49647] [‘0.89’, 44,‘sp|Q03924-2|ZN117_HUMAN Isoform 2 of Zinc finger protein 117 OS = Homosapiens GN = ZNF117’, 427, 50051] [‘0.89’, 40, ‘sp|Q147U1-2|ZN846_HUMANIsoform 2 of Zinc finger protein 846 OS = Homo sapiens GN = ZNF846’,404, 45838] [‘0.89’, 29, ‘sp|Q9BXY4-2|RSPO3_HUMAN Isoform 2 ofR-spondin-3 OS = Homo sapiens GN = RSPO3’, 292, 33233] [‘0.89’, 20,‘sp|Q5T4W7-3|ARTN_HUMAN Isoform 3 of Artemin OS = Homo sapiens GN =ARTN’, 228, 23616] [‘0.89’, 18, ‘sp|Q6UXX9-3|RSPO2_HUMAN Isoform 3 ofR-spondin-2 OS = Homo sapiens GN = RSPO2’, 179, 20972] [‘0.89’, 13,‘sp|Q7Z422-2|CA144_HUMAN Isoform 2 of UPF0485 protein C1orf144 OS = Homosapiens GN = C1orf144’, 132, 14604] [‘0.89’, 9, ‘sp|Q8NFV4-3|ABHDB_HUMANIsoform 3 of Abhydrolase domain-containing protein 11 OS = Homo sapiensGN = ABHD11’, 97, 10361] [‘0.89’, 8, ‘sp|P48061-2|SDF1_HUMAN IsoformAlpha of Stromal cell-derived factor 1 OS = Homo sapiens GN = CXCL12’,89, 10103] [‘0.88’, 15, ‘sp|Q92466-3|DDB2_HUMAN Isoform D2 of DNAdamage-binding protein 2 OS = Homo sapiens GN = DDB2’, 156, 
17434][‘0.88’, 8, ‘sp|Q9HD64-2|GAGD2_HUMAN Isoform B of G antigen family Dmember 2 OS = Homo sapiens GN = XAGE1’, 81, 9077] [‘0.88’, 7,‘sp|Q9BZJ0-5|CRNL1_HUMAN Isoform 5 of Crooked neck-like protein 1 OS =Homo sapiens GN = CRNKL1’, 74, 7946] [‘0.88’, 6, ‘sp|Q8TC05-3|MDM1_HUMANIsoform 3 of Nuclear protein MDM1 OS = Homo sapiens GN = MDM1’, 69,7926] [‘0.87’, 74, ‘sp|Q9NYF8-4|BCLF1_HUMAN Isoform 4 ofBcl-2-associated transcription factor 1 OS = Homo sapiens GN = BCLAF1’,747, 85937] [‘0.87’, 67, ‘sp|Q8NDQ6-1|ZN540_HUMAN Isoform 1 of Zincfinger protein 540 OS = Homo sapiens GN = ZNF540’, 660, 77093] [‘0.87’,52, ‘sp|Q03936-2|ZNF92_HUMAN Isoform 2 of Zinc finger protein 92 OS =Homo sapiens GN = ZNF92’, 517, 60209] [‘0.87’, 44,‘sp|Q8NEP9-3|ZN555_HUMAN Isoform 3 of Zinc finger protein 555 OS = Homosapiens GN = ZNF555’, 440, 51594] [‘0.87’, 25, ‘sp|P22090|RS4Y1_HUMAN40S ribosomal protein S4, Y isoform 1 OS = Homo sapiens GN = RPS4Y1’,263, 29455] [‘0.87’, 20, ‘sp|P55075-2|FGF8_HUMAN Isoform FGF-8A ofFibroblast growth factor 8 OS = Homo sapiens GN = FGF8’, 204, 23522][‘0.87’, 20, ‘sp|P12272-3|PTHR_HUMAN Isoform 3 of Parathyroidhormone-related protein OS = Homo sapiens GN = PTHLH’, 209, 23942][‘0.87’, 16, ‘sp|Q7Z7F7-2|RM55_HUMAN Isoform 2 of 39S ribosomal proteinL55, mitochondrial OS = Homo sapiens GN = MRPL55’, 164, 18902] [‘0.87’,12, ‘sp|P10747-4|CD28_HUMAN Isoform 4 of T-cell-specific surfaceglycoprotein CD28 OS = Homo sapiens GN = CD28’, 123, 14013] [‘0.86’, 33,‘sp|Q8N8C0-2|ZN781_HUMAN Isoform 2 of Zinc finger protein 781 OS = Homosapiens GN = ZNF781’, 327, 38274] [‘0.86’, 29, ‘sp|Q15973-1|ZN124_HUMANIsoform 3 of Zinc finger protein 124 OS = Homo sapiens GN = ZNF124’,296, 33852] [‘0.86’, 23, ‘sp|Q9H0A6-4|RNF32_HUMAN Isoform 4 of RINGfinger protein 32 OS = Homo sapiens GN = RNF32’, 235, 27130] [‘0.86’,21, ‘sp|Q8IWN7-2|RP1L1_HUMAN Isoform 2 of Retinitis pigmentosa 1-like 1protein OS = Homo sapiens GN = RP1L1’, 222, 24854] [‘0.86’, 20,‘sp|Q6PI47-3|KCD18_HUMAN Isoform 3 of 
BTB/POZ domain-containing proteinKCTD18 OS = Homo sapiens GN = KCTD18’, 221, 23414] [‘0.86’, 18,‘sp|O75494-4|FUSIP_HUMAN Isoform 4 of FUS-interactingserine-arginine-rich protein 1 OS = Homo sapiens GN = FUSIP1’, 173,21000] [‘0.86’, 13, ‘sp|P10747-3|CD28_HUMAN Isoform 3 of T-cell-specificsurface glycoprotein CD28 OS = Homo sapiens GN = CD28’, 136, 15369][‘0.86’, 7, ‘sp|P16157-20|ANK1_HUMAN Isoform Mu20 of Ankyrin-1 OS = Homosapiens GN = ANK1’, 74, 8374] [‘0.85’, 45, ‘sp|Q68DY1-2|ZN626_HUMANIsoform 2 of Zinc finger protein 626 OS = Homo sapiens GN = ZNF626’,464, 53889] [‘0.85’, 21, ‘sp|O60258-1|FGF17_HUMAN Isoform 1 ofFibroblast growth factor 17 OS = Homo sapiens GN = FGF17’, 216, 24891][‘0.85’, 17, ‘sp|P82912-1|RT11_HUMAN Isoform 1 of 28S ribosomal proteinS11, mitochondrial OS = Homo sapiens GN = MRPS11’, 194, 20615] [‘0.85’,13, ‘sp|Q9BWV2-3|SPAT9_HUMAN Isoform 3 of Spermatogenesis-associatedprotein 9 OS = Homo sapiens GN = SPATA9’, 135, 15275] [‘0.85’, 12,‘sp|Q9Y5P2-1|CSAG2_HUMAN Isoform 1 of Chondrosarcoma-associated gene2/3A protein OS = Homo sapiens GN = CSAG2’, 127, 14429] [‘0.85’, 10,‘sp|Q6RVD6-1|SPAT8_HUMAN Isoform 1 of Spermatogenesis-associated protein8 OS = Homo sapiens GN = SPATA8’, 105, 11727] [‘0.84’, 46,‘sp|Q3SXZ3-1|ZN718_HUMAN Isoform 1 of Zinc finger protein 718 OS = Homosapiens GN = ZNF718’, 478, 55404] [‘0.84’, 36, ‘sp|Q3SY52-3|ZIK1_HUMANIsoform 3 of Zinc finger protein interacting with ribonucleoprotein K OS= Homo sapiens GN = ZIK1’, 384, 43717] [‘0.84’, 24,‘sp|Q9BU76-1|MMTA2_HUMAN Isoform 1 of Multiple myeloma tumor-associatedprotein 2 OS = Homo sapiens GN = MMTAG2’, 263, 29411] [‘0.84’, 24,‘sp|Q8TD47|RS4Y2_HUMAN 40S ribosomal protein S4, Y isoform 2 OS = Homosapiens GN = RPS4Y2’, 263, 29295] [‘0.84’, 20, ‘sp|Q96CX3-2|ZN501_HUMANIsoform 2 of Zinc finger protein 501 OS = Homo sapiens GN = ZNF501’,215, 24880] [‘0.84’, 20, ‘sp|Q147U1-3|ZN846_HUMAN Isoform 3 of Zincfinger protein 846 OS = Homo sapiens GN = ZNF846’, 210, 24075] [‘0.84’,9, 
‘sp|P56134-1|ATPK_HUMAN Isoform 1 of ATP synthase subunit f,mitochondrial OS = Homo sapiens GN = ATP5J2’, 94, 10917] [‘0.83’, 48,‘sp|Q96S94-1|CCNL2_HUMAN Isoform 1 of Cyclin-L2 OS = Homo sapiens GN =CCNL2’, 520, 58147] [‘0.83’, 27, ‘sp|Q9NWB6-2|ARGL1_HUMAN Isoform 2 ofArginine and glutamate-rich protein 1 OS = Homo sapiens GN = ARGLU1’,273, 32885] [‘0.83’, 24, ‘sp|P62701|RS4X_HUMAN 40S ribosomal protein S4,X isoform OS = Homo sapiens GN = RPS4X’, 263, 29597] [‘0.83’, 23,‘sp|Q6UXX9-1|RSPO2_HUMAN Isoform 1 of R-spondin-2 OS = Homo sapiens GN =RSPO2’, 243, 28314] [‘0.83’, 20, ‘sp|P55075-3|FGF8_HUMAN Isoform FGF-8Bof Fibroblast growth factor 8 OS = Homo sapiens GN = FGF8’, 215, 24711][‘0.83’, 12, ‘sp|Q8N3H0-1|F19A2_HUMAN Isoform 1 of Protein FAM19A2 OS =Homo sapiens GN = FAM19A2’, 131, 14620] [‘0.83’, 12,‘sp|Q6N063-3|OGFD2_HUMAN Isoform 3 of 2-oxoglutarate and iron-dependentoxygenase domain-containing protein 2 OS = Homo sapiens GN = OGFOD2’,129, 14734] [‘0.83’, 9, ‘sp|Q56VL3-2|OCAD2_HUMAN Isoform 2 of OCIAdomain-containing protein 2 OS = Homo sapiens GN = OCIAD2’, 99, 11029][‘0.82’, 34, ‘sp|Q8N8C0-1|ZN781_HUMAN Isoform 1 of Zinc finger protein781 OS = Homo sapiens GN = ZNF781’, 355, 41526] [‘0.82’, 20,‘sp|Q5T4W7-2|ARTN_HUMAN Isoform 2 of Artemin OS = Homo sapiens GN =ARTN’, 237, 24471] [‘0.82’, 17, ‘sp|Q9NY12-2|NOLA1_HUMAN Isoform 2 ofH/ACA ribonucleoprotein complex subunit 1 OS = Homo sapiens GN = NOLA1’,199, 20834] [‘0.81’, 37, ‘sp|Q96SQ7-2|ATOH8_HUMAN Isoform 2 of Proteinatonal homolog 8 OS = Homo sapiens GN = ATOH8’, 416, 45785] [‘0.81’, 22,‘sp|Q9NP64-1|NO40_HUMAN Isoform 1 of Nucleolar protein of 40 kDa OS =Homo sapiens GN = ZCCHC17’, 241, 27569] [‘0.81’, 22,‘sp|Q92913-1|FGF13_HUMAN Isoform 1A of Fibroblast growth factor 13 OS =Homo sapiens GN = FGF13’, 245, 27563] [‘0.81’, 21,‘sp|P55075-1|FGF8_HUMAN Isoform FGF-8E of Fibroblast growth factor 8 OS= Homo sapiens GN = FGF8’, 233, 26525] [‘0.81’, 18,‘sp|O75494-3|FUSIP_HUMAN Isoform 3 of 
FUS-interactingserine-arginine-rich protein 1 OS = Homo sapiens GN = FUSIP1’, 183,22222] [‘0.81’, 9, ‘sp|Q7L592-3|CB056_HUMAN Isoform 3 of UPF0511 proteinC2orf56, mitochondrial OS = Homo sapiens GN = C2orf56’, 99, 11289][‘0.81’, 7, ‘sp|Q6PDA7-3|SG11A_HUMAN Isoform 3 of Sperm-associatedantigen 11A OS = Homo sapiens GN = SPAG11A’, 82, 9075] [‘0.80’, 72,‘sp|O14746-2|TERT_HUMAN Isoform 2 of Telomerase reverse transcriptase OS= Homo sapiens GN = TERT’, 807, 90225] [‘0.80’, 54,‘sp|Q86YE8-4|ZN573_HUMAN Isoform 4 of Zinc finger protein 573 OS = Homosapiens GN = ZNF573’, 578, 67865] [‘0.80’, 30, ‘sp|O95218-1|ZRAB2_HUMANIsoform 1 of Zinc finger Ran-binding domain-containing protein 2 OS =Homo sapiens GN = ZRANB2’, 330, 37404] [‘0.80’, 24,‘sp|Q96CX3-1|ZN501_HUMAN Isoform 1 of Zinc finger protein 501 OS = Homosapiens GN = ZNF501’, 271, 31178] [‘0.80’, 22, ‘sp|Q92915-1|FGF14_HUMANIsoform 1 of Fibroblast growth factor 14 OS = Homo sapiens GN = FGF14’,247, 27701] [‘0.80’, 16, ‘sp|P82912-2|RT11_HUMAN Isoform 2 of 28Sribosomal protein S11, mitochondrial OS = Homo sapiens GN = MRPS11’,193, 20459]
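The bracketed entries in the listing above share a regular five-field shape, so they can be read programmatically. The following is a minimal sketch, not part of the described methods. It assumes the fields are, in order, a quoted charge-to-mass ratio, a net charge, a UniProt-style header, a sequence length in residues, and a molecular weight in daltons; this reading of the fields is an assumption, since the listing itself does not label them. The names `Entry`, `ENTRY_RE`, and `parse_entries` are hypothetical.

```python
import re
from dataclasses import dataclass
from typing import List

# Assumed field order (not labeled in the listing itself):
# quoted ratio, net charge, UniProt-style header, length (residues), molecular weight (Da).
ENTRY_RE = re.compile(
    r"\[\s*[‘']([\d.]+)[’']\s*,\s*(-?\d+)\s*,\s*[‘'](.*?)[’']\s*,\s*(\d+)\s*,\s*(\d+)\s*\]"
)


@dataclass
class Entry:
    ratio: float      # first, quoted value (assumed charge-to-mass ratio)
    charge: int       # second value (assumed net charge)
    accession: str    # accession taken from the 'sp|ACCESSION|NAME ...' header
    header: str       # full header text as it appears in the listing
    length: int       # fourth value (assumed sequence length)
    mol_weight: int   # fifth value (assumed molecular weight in Da)


def parse_entries(text: str) -> List[Entry]:
    """Extract every bracketed five-field entry found in `text`."""
    entries = []
    for ratio, charge, header, length, mw in ENTRY_RE.findall(text):
        parts = header.split("|")
        # Headers look like 'sp|Q99878|H2A1J_HUMAN ...'; the accession is the middle part.
        accession = parts[1] if len(parts) > 2 else ""
        entries.append(
            Entry(float(ratio), int(charge), accession, header, int(length), int(mw))
        )
    return entries


# Two entries copied from the listing, with word spacing restored.
sample = (
    "[‘1.17’, 16, ‘sp|Q99878|H2A1J_HUMAN Histone H2A type 1-J "
    "OS = Homo sapiens GN = HIST1H2AJ’, 128, 13936] "
    "[‘1.74’, 44, ‘sp|Q01130|SFRS2_HUMAN Splicing factor, arginine/serine-rich 2 "
    "OS = Homo sapiens GN = SFRS2’, 221, 25476]"
)

for entry in parse_entries(sample):
    print(entry.accession, entry.charge, entry.length, entry.mol_weight)
```

The header is matched non-greedily up to the closing quote that is followed by the two trailing integers, so commas inside protein names (for example, "Splicing factor, arginine/serine-rich 2") do not split a record.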

Nucleic Acids

The present invention provides systems and methods for delivery of nucleic acids to cells in vivo or in vitro. Such systems and methods typically involve association of one or more nucleic acids with supercharged proteins to form a complex, and delivery of the complex to one or more cells. In some embodiments, the nucleic acid may have therapeutic activity. In some embodiments, delivery of the complex to cells involves administering a complex comprising supercharged proteins associated with a nucleic acid to a subject in need thereof. In some embodiments, a nucleic acid by itself may not be able to enter the interior of a cell, but is able to enter the interior of a cell when complexed with a supercharged protein. In some embodiments, a supercharged protein is utilized to allow a nucleic acid to enter a cell. Nucleic acids in accordance with the invention may themselves have therapeutic activity or may direct expression of an RNA and/or protein that has therapeutic activity. Therapeutic activities of nucleic acids are discussed in further detail below.

The term “nucleic acid,” in its broadest sense, includes any compound and/or substance that is or can be incorporated into an oligonucleotide chain. Exemplary nucleic acids for use in accordance with the present invention include, but are not limited to, one or more of DNA, RNA, hybrids thereof, RNAi-inducing agents, RNAi agents, siRNAs, shRNAs, miRNAs, antisense RNAs, ribozymes, catalytic DNA, RNAs that induce triple helix formation, aptamers, vectors, etc., described in further detail below.

Nucleic acids for use in accordance with the invention may be prepared according to any available technique including, but not limited to, chemical synthesis, enzymatic synthesis, enzymatic or chemical cleavage of a longer precursor, etc. Methods of synthesizing RNAs are known in the art (see, e.g., Gait, M. J. (ed.) Oligonucleotide synthesis: a practical approach, Oxford [Oxfordshire], Washington, D.C.: IRL Press, 1984; and Herdewijn, P. (ed.) Oligonucleotide synthesis: methods and applications, Methods in Molecular Biology, v. 288 (Clifton, N.J.) Totowa, N.J.: Humana Press, 2005; both of which are incorporated herein by reference).

Nucleic acids may comprise naturally occurring nucleosides, modified nucleosides, naturally occurring nucleosides with hydrocarbon linkers (e.g., an alkylene) or a polyether linker (e.g., a PEG linker) inserted between one or more nucleosides, modified nucleosides with hydrocarbon or PEG linkers inserted between one or more nucleosides, or a combination thereof. In some embodiments, nucleotides or modified nucleotides can be replaced with a hydrocarbon linker or a polyether linker provided that the function of the nucleic acid is not substantially reduced by the substitution.

It will be appreciated by those of ordinary skill in the art that nucleic acids in accordance with the present invention may comprise nucleotides entirely of the types found in naturally occurring nucleic acids, or may instead include one or more nucleotide analogs or have a structure that otherwise differs from that of a naturally occurring nucleic acid. U.S. Pat. Nos. 6,403,779; 6,399,754; 6,225,460; 6,127,533; 6,031,086; 6,005,087; 5,977,089 (each of which is incorporated herein by reference); and references therein disclose a wide variety of specific nucleotide analogs and modifications that may be used. See Crooke, S. (ed.) Antisense Drug Technology: Principles, Strategies, and Applications (1st ed.), Marcel Dekker; ISBN: 0824705661 (2001; incorporated herein by reference) and references therein. For example, 2′-modifications include halo, alkoxy and allyloxy groups. In some embodiments, the 2′-OH group is replaced by a group selected from H, OR, R, halo, SH, SR, NH₂, NHR, NR₂ or CN, wherein R is C₁-C₆ alkyl, alkenyl, or alkynyl, and halo is F, Cl, Br, or I. Examples of modified linkages include phosphorothioate and 5′-N-phosphoramidite linkages.

Nucleic acids comprising a variety of different nucleotide analogs, modified backbones, or non-naturally occurring internucleoside linkages can be utilized in accordance with the present invention. Nucleic acids of the present invention may include natural nucleosides (i.e., adenosine, thymidine, guanosine, cytidine, uridine, deoxyadenosine, deoxythymidine, deoxyguanosine, and deoxycytidine) or modified nucleosides. Examples of modified nucleotides include base-modified nucleosides (e.g., aracytidine, inosine, isoguanosine, nebularine, pseudouridine, 2,6-diaminopurine, 2-aminopurine, 2-thiothymidine, 3-deaza-5-azacytidine, 2′-deoxyuridine, 3-nitropyrrole, 4-methylindole, 4-thiouridine, 4-thiothymidine, 2-aminoadenosine, 2-thiouridine, 5-bromocytidine, 5-iodouridine, 6-azauridine, 6-chloropurine, 7-deazaadenosine, 7-deazaguanosine, 8-azaadenosine, 8-azidoadenosine, benzimidazole, M1-methyladenosine, pyrrolo-pyrimidine, 2-amino-6-chloropurine, 3-methyl adenosine, 5-propynylcytidine, 5-propynyluridine, 5-bromouridine, 5-fluorouridine, 5-methylcytidine, 8-oxoadenosine, 8-oxoguanosine, O(6)-methylguanine, and 2-thiocytidine), chemically or biologically modified bases (e.g., methylated bases), modified sugars (e.g., 2′-fluororibose, 2′-aminoribose, 2′-azidoribose, 2′-O-methylribose, L-enantiomeric nucleosides, arabinose, and hexose), modified phosphate groups (e.g., phosphorothioates and 5′-N-phosphoramidite linkages), and combinations thereof. Natural and modified nucleotide monomers for the chemical synthesis of nucleic acids are readily available. In some cases, nucleic acids comprising such modifications display improved properties relative to nucleic acids consisting only of naturally occurring nucleotides. In some embodiments, nucleic acid modifications described herein are utilized to reduce and/or prevent digestion by nucleases (e.g., exonucleases, endonucleases, etc.). For example, the structure of a nucleic acid may be stabilized by including nucleotide analogs at the 3′ end of one or both strands in order to reduce digestion.

Modified nucleic acids need not be uniformly modified along the entire length of the molecule. Different nucleotide modifications and/or backbone structures may exist at various positions in the nucleic acid. One of ordinary skill in the art will appreciate that the nucleotide analogs or other modification(s) may be located at any position(s) of a nucleic acid such that the function of the nucleic acid is not substantially affected. To give but one example, modifications may be located at any position of a nucleic acid targeting moiety such that the ability of the nucleic acid targeting moiety to specifically bind to the target is not substantially affected. The modified region may be at the 5′-end and/or the 3′-end of one or both strands. For example, modified nucleic acid targeting moieties in which approximately 1 to approximately 5 residues at the 5′ and/or 3′ end of either or both strands are nucleotide analogs and/or have a backbone modification have been employed. A modification may be a 5′ or 3′ terminal modification. One or both nucleic acid strands may comprise at least 50% unmodified nucleotides, at least 80% unmodified nucleotides, at least 90% unmodified nucleotides, or 100% unmodified nucleotides.

Nucleic acids in accordance with the present invention may, for example, comprise a modification to a sugar, nucleoside, or internucleoside linkage such as those described in U.S. Patent Publications 2003/0175950, 2004/0192626, 2004/0092470, 2005/0020525, and 2005/0032733; each of which is incorporated herein by reference. The present invention encompasses the use of any nucleic acid having any one or more of the modifications described therein. For example, a number of terminal conjugates, e.g., lipids such as cholesterol, lithocholic acid, lauric acid, or long alkyl branched chains, have been reported to improve cellular uptake. Analogs and modifications may be tested using any appropriate assay known in the art, for example, to select those that result in improved target gene silencing by an RNAi agent, etc. In some embodiments, nucleic acids in accordance with the present invention may comprise one or more non-natural nucleoside linkages. In some embodiments, one or more internal nucleotides at the 3′-end, 5′-end, or both 3′- and 5′-ends of the nucleic acid targeting moiety are inverted to yield a linkage such as a 3′-3′ linkage or a 5′-5′ linkage.

In some embodiments, nucleic acids in accordance with the present invention are not synthetic, but are naturally-occurring entities that have been isolated from their natural environments.

RNAi Agents

RNA Interference

In some embodiments, nucleic acids that can be associated with supercharged proteins include agents that mediate RNA interference (RNAi). RNAi is a mechanism that inhibits expression of specific genes. RNAi typically inhibits gene expression at the level of translation, but can function by inhibiting gene expression at the level of transcription. RNAi targets include any RNA that might be present in cells, including but not limited to, cellular transcripts, pathogen transcripts (e.g., from viruses, bacteria, fungi, etc.), transposons, vectors, etc.

The RNAi pathway is initiated by the enzyme dicer, which cleaves long, double-stranded RNA (dsRNA) molecules into short fragments of 20-25 base pairs, optionally with a few unpaired overhang bases on one or both ends. One of the two strands of each fragment, known as the guide strand, is then incorporated into the RNA-induced silencing complex (RISC) and pairs with complementary sequences. The other strand is degraded during RISC activation. The most well-studied outcome of this recognition event is post-transcriptional gene silencing. This occurs when the guide strand specifically pairs with a target transcript and induces degradation of the target transcript by argonaute, the catalytic component of the RISC complex. Another outcome is epigenetic changes to a gene (e.g., histone modification and DNA methylation) affecting the degree to which the gene is transcribed.

Introduction of long double-stranded RNA (e.g., greater than 30 bp) into mammalian cells results in systemic, nonspecific inhibition of translation due to activation of the interferon response. A breakthrough occurred when it was found that this obstacle could be overcome by the use of synthetic short RNAs (e.g., 19-25 bp) that can be either delivered exogenously (Elbashir et al., 2001, Nature, 411:494; incorporated herein by reference) or expressed endogenously from RNA polymerase II or III promoters.

The phenomenon of RNAi is discussed in greater detail, for example, in the following references, each of which is incorporated herein by reference: Elbashir et al., 2001, Genes Dev., 15:188; Fire et al., 1998, Nature, 391:806; Tabara et al., 1999, Cell, 99:123; Hammond et al., 2000, Nature, 404:293; Zamore et al., 2000, Cell, 101:25; Chakraborty, 2007, Curr. Drug Targets, 8:469; and Morris and Rossi, 2006, Gene Ther., 13:553.

As used herein, the term “RNAi agent” refers to an RNA, optionally including one or more nucleotide analogs or modifications, having a structure characteristic of molecules that can mediate inhibition of gene expression through an RNAi mechanism. Generally, an RNAi agent includes a portion that is substantially complementary to a target RNA. In some embodiments, RNAi agents are at least partly double-stranded. In some embodiments, RNAi agents are single-stranded. In some embodiments, exemplary RNAi agents can include short interfering RNA (siRNA), short hairpin RNA (shRNA), and/or micro RNA (miRNA). In some embodiments, the term “RNAi agent” may refer to any RNA, RNA derivative, and/or nucleic acid encoding an RNA that induces an RNAi effect (e.g., degradation of target RNA and/or inhibition of translation).

As used herein, the term “RNAi-inducing agent” encompasses any entity that delivers, regulates, and/or modifies the activity of an RNAi agent. In some embodiments, RNAi-inducing agents may include vectors (other than naturally occurring molecules not modified by the hand of man) whose presence within a cell results in RNAi and leads to reduced expression of a transcript to which the RNAi-inducing agent is targeted. In some embodiments, an RNAi-inducing agent is an “RNAi-inducing vector,” which refers to a vector whose presence within a cell results in production of one or more RNAs that self-hybridize or hybridize to each other to form an RNAi agent (e.g., siRNA, shRNA, and/or miRNA). In various embodiments, this term encompasses plasmids, e.g., DNA vectors (whose sequence may comprise sequence elements derived from a virus), or viruses (other than naturally occurring viruses or plasmids that have not been modified by the hand of man), whose presence within a cell results in production of one or more RNAs that self-hybridize or hybridize to each other to form an RNAi agent. In general, the vector comprises a nucleic acid operably linked to expression signal(s) so that one or more RNAs that hybridize or self-hybridize to form an RNAi agent are transcribed when the vector is present within a cell. Thus the vector provides a template for intracellular synthesis of the RNA or RNAs or precursors thereof. In some embodiments, RNAi-inducing agents are compositions comprising RNAi agents and one or more pharmaceutically acceptable excipients and/or carriers. For the purposes of the present invention, any partly or fully double-stranded short RNA as described herein, one strand of which binds to a target transcript and reduces its expression (i.e., reduces the level of the transcript and/or reduces synthesis of the polypeptide encoded by the transcript) is considered to be an RNAi-inducing agent, regardless of whether it acts by triggering degradation, inhibiting translation, or by other means. In addition, any precursor RNA structure that may be processed in vivo (i.e., within a cell or organism) to generate such an RNAi-inducing agent is useful in the present invention.

RNAi agents in accordance with the invention may target any portion of a transcript. In some embodiments, a target transcript is located within a coding sequence of a gene. In some embodiments, a target transcript is located within non-coding sequence. In some embodiments, a target transcript is located within an exon. In some embodiments, a target transcript is located within an intron. In some embodiments, a target transcript is located within a 5′ untranslated region (UTR) or 3′ UTR of a gene. In some embodiments, a target transcript is located within an enhancer region. In some embodiments, a target transcript is located within a promoter.

For any particular gene target, design of RNAi agents and/or RNAi-inducing agents typically follows certain guidelines. In general, it is desirable to avoid sections of target transcript that may be shared with other transcripts whose degradation is not desired. In some embodiments, RNAi agents and/or RNAi-inducing entities target transcripts and/or portions thereof that are highly conserved. In some embodiments, RNAi agents and/or RNAi-inducing entities target transcripts and/or portions thereof that are not highly conserved.
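The guideline of avoiding sections shared with other transcripts can be illustrated with a minimal sketch. It assumes exact substring matching only, RNA sequences written 5′ to 3′, and a window width of 19 nt; the function name `unique_target_windows` and its parameters are hypothetical and are not part of the disclosure. Practical design tools typically also screen for near matches, which this sketch does not attempt.

```python
from typing import Iterable, List, Tuple


def unique_target_windows(
    target: str, others: Iterable[str], width: int = 19
) -> List[Tuple[int, str]]:
    """Return (start, window) pairs of the target that occur in no other transcript.

    Assumption: only exact matches count as "shared"; mismatched
    near-duplicates are not detected by this simple scan.
    """
    others = list(others)
    unique = []
    for start in range(len(target) - width + 1):
        window = target[start:start + width]
        # Keep the window only if no off-target transcript contains it verbatim.
        if not any(window in other for other in others):
            unique.append((start, window))
    return unique


# Toy example with a short window so the behavior is easy to follow.
candidates = unique_target_windows("AAAACCCCGGGG", ["CCCCGG"], width=4)
for start, window in candidates:
    print(start, window)
```

Windows that survive this filter are only candidates; whether a highly conserved or a poorly conserved portion is preferred depends on the embodiment, as noted above.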

siRNAs and shRNAs

As used herein, an “siRNA” refers to an RNAi agent comprising an RNA duplex (referred to herein as a “duplex region”) that is approximately 19 base pairs (bp) in length and optionally further comprises one or two single-stranded overhangs. In some embodiments, an siRNA comprises a duplex region ranging from 15 bp to 29 bp in length and optionally further comprising one or two single-stranded overhangs. An siRNA is typically formed from two RNA molecules (i.e., two strands) that hybridize together. One strand of an siRNA includes a portion that hybridizes with a target transcript. In some embodiments, siRNAs mediate inhibition of gene expression by causing degradation of target transcripts.

As used herein, an “shRNA” refers to an RNAi agent comprising an RNA having at least two complementary portions hybridized or capable of hybridizing to form a double-stranded (duplex) structure sufficiently long to mediate RNAi (typically at least approximately 19 bp in length), and at least one single-stranded portion, typically ranging between approximately 1 nucleotide (nt) and approximately 10 nt in length, that forms a loop. In some embodiments, an shRNA comprises a duplex portion ranging from 15 bp to 29 bp in length and at least one single-stranded portion, typically ranging between approximately 1 nt and approximately 10 nt in length, that forms a loop. In some embodiments, the single-stranded portion is approximately 1 nt, approximately 2 nt, approximately 3 nt, approximately 4 nt, approximately 5 nt, approximately 6 nt, approximately 7 nt, approximately 8 nt, approximately 9 nt, or approximately 10 nt in length. In some embodiments, shRNAs are processed into siRNAs by cellular RNAi machinery (e.g., by Dicer). Thus, in some embodiments, shRNAs may be precursors of siRNAs. Regardless, shRNAs in general are capable of inhibiting expression of a target RNA, similar to siRNAs. As used herein, the term “short RNAi agent” is used to refer to siRNAs and shRNAs, collectively.
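The stem-loop layout described above can be sketched as a single strand assembled from a stem, a loop, and the reverse complement of the stem. This is an illustrative sketch only: the function names, the sense-loop-antisense ordering, and the example sequences are assumptions, and only Watson-Crick pairing over A, C, G, and U is modeled.

```python
# Watson-Crick complement table for RNA (assumption: no modified bases).
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence written 5'->3'."""
    return rna.translate(COMPLEMENT)[::-1]


def build_shrna(stem: str, loop: str) -> str:
    """Assemble stem + loop + reverse-complement stem as one strand (5'->3').

    The length limits mirror the ranges stated in the text:
    a 15-29 bp duplex portion and a loop of roughly 1-10 nt.
    """
    stem, loop = stem.upper(), loop.upper()
    if set(stem + loop) - set("ACGU"):
        raise ValueError("expected RNA sequences containing only A, C, G, U")
    if not 15 <= len(stem) <= 29:
        raise ValueError("duplex portion should be 15-29 nt long")
    if not 1 <= len(loop) <= 10:
        raise ValueError("loop should be 1-10 nt long")
    return stem + loop + reverse_complement(stem)


# Arbitrary 19-nt stem and 9-nt loop chosen purely for illustration.
hairpin = build_shrna("GCAUCGAUUAGCCGAUAGC", "UUCAAGAGA")
print(len(hairpin), hairpin)
```

When the strand folds back on itself, the two stem halves pair to form the duplex portion and the middle segment remains single-stranded as the loop.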

As mentioned above, short RNAi agents typically include a base-paired region (“duplex region”) between approximately 15 nt and approximately 29 nt long, e.g., approximately 19 nt long, and may optionally have one or more free or looped ends. In some embodiments, short RNAi agents have a duplex region of about 15 nt, about 16 nt, about 17 nt, about 18 nt, about 19 nt, about 20 nt, about 21 nt, about 22 nt, about 23 nt, about 24 nt, about 25 nt, about 26 nt, about 27 nt, about 28 nt, or about 29 nt in length. However, it is not required that the administered agent have this structure. For example, RNAi-inducing agents may comprise any structure capable of being processed in vivo to the structure of a short RNAi agent. In some embodiments, an RNAi-inducing agent is delivered to a cell, where it undergoes one or more processing steps before becoming a functional short RNAi agent. In such cases, those of ordinary skill in the art will appreciate that it is desirable for the RNAi-inducing agent to include sequences that may be necessary and/or helpful for its processing.

In describing RNAi-inducing agents and/or short RNAi agents, it is convenient to refer to an agent as having two strands. In general, the sequence of the duplex portion of one strand of an RNAi-inducing agent and/or short RNAi agent is substantially complementary to the target transcript in this region. The sequence of the duplex portion of the other strand of the RNAi-inducing agent and/or short RNAi agent is typically substantially identical to the targeted portion of the target transcript. The strand comprising the portion complementary to the target is referred to as the “antisense strand,” while the other strand is often referred to as the “sense strand.” The portion of the antisense strand that is complementary to the target may be referred to as the “inhibitory region.”

RNAi-inducing agents and/or short RNAi agents typically include a region (the “duplex region”), one strand of which contains an inhibitory region between 15 nt and 29 nt in length that is sufficiently complementary to a portion of the target transcript (the “target portion”), so that a hybrid (the “core region”) can form in vivo between this strand and the target transcript. The core region is understood not to include overhangs.

In some embodiments, short RNAi agents have an inhibitory region of about 15 nt, about 16 nt, about 17 nt, about 18 nt, about 19 nt, about 20 nt, about 21 nt, about 22 nt, about 23 nt, about 24 nt, about 25 nt, about 26 nt, about 27 nt, about 28 nt, or about 29 nt in length. In some embodiments, short RNAi agents have an inhibitory region of about 19 nt in length. In some embodiments, hybridization of one strand of a short RNAi agent to its target transcript yields a core region of about 15 nt, about 16 nt, about 17 nt, about 18 nt, about 19 nt, about 20 nt, about 21 nt, about 22 nt, about 23 nt, about 24 nt, about 25 nt, about 26 nt, about 27 nt, about 28 nt, or about 29 nt in length. In some embodiments, hybridization of one strand of a short RNAi agent to its target transcript yields a core region of about 19 nt in length.

Target transcripts are often cleaved near the center of the duplex region. In some embodiments, target transcripts are cleaved at 11 nt or 12 nt downstream of the first base pair of the duplex that forms between the siRNA and target transcript (see, e.g., Elbashir et al., 2001, Genes Dev., 15:188; incorporated herein by reference).

In some embodiments, siRNAs comprise 3′-overhangs at one or both ends of the duplex region. In some embodiments, an shRNA comprises a 3′ overhang at its free end. In some embodiments, siRNAs comprise a single nucleotide 3′-overhang. In some embodiments, siRNAs comprise a 3′-overhang of 2 nt. In some embodiments, siRNAs comprise a 3′-overhang of 1 nt. Overhangs, if present, may, but need not be, complementary to the target transcript. siRNAs with 2 nt-3 nt overhangs on their 3′-ends are frequently more efficient in reducing target transcript levels than siRNAs with blunt ends.

Any desired sequence (e.g., UU) may simply be appended to the 3′ ends of antisense and/or sense core regions to generate 3′-overhangs. In general, overhangs containing one or more pyrimidines, usually U, T, or dT, are employed. When synthesizing RNAi-inducing agents, it may be more convenient to use T rather than U in the overhang(s). Use of dT rather than T may confer increased stability.
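The relationship between the target portion, the two strands, and their 3′-overhangs can be sketched as follows. This is a minimal illustration under stated assumptions: sequences are plain RNA strings written 5′ to 3′, the sense strand is taken as identical to the target portion, the antisense strand as its exact reverse complement, and the same overhang is appended to both strands. The name `sirna_strands` is hypothetical, and modified residues such as dT are not represented.

```python
from typing import Tuple

# Watson-Crick complement table for RNA (assumption: unmodified bases only).
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence written 5'->3'."""
    return rna.translate(COMPLEMENT)[::-1]


def sirna_strands(target_portion: str, overhang: str = "UU") -> Tuple[str, str]:
    """Return (sense, antisense), each 5'->3' with the overhang on its 3' end.

    The duplex portion of the sense strand equals the target portion; the
    duplex portion of the antisense strand is the inhibitory region.
    """
    target_portion = target_portion.upper()
    if set(target_portion) - set("ACGU"):
        raise ValueError("expected an RNA sequence containing only A, C, G, U")
    sense = target_portion + overhang
    antisense = reverse_complement(target_portion) + overhang
    return sense, antisense


# Arbitrary 19-nt target portion used purely for illustration.
sense, antisense = sirna_strands("GCAUCGAUUAGCCGAUAGC")
print("sense    ", sense)
print("antisense", antisense)
```

Because each overhang sits on the 3′ end of its own strand, the two overhangs end up at opposite ends of the annealed duplex.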

In some embodiments, the inhibitory region of a short RNAi agent is 100% complementary to a region of a target transcript. However, in some embodiments, the inhibitory region of a short RNAi agent is less than 100% complementary to a region of a target transcript. The inhibitory region need only be sufficiently complementary to a target transcript such that hybridization can occur, e.g., under physiological conditions in a cell and/or in an in vitro system that supports RNAi (e.g., a Drosophila extract system).

One of ordinary skill in the art will appreciate that short RNAi agent duplexes may tolerate mismatches and/or bulges, particularly mismatches within the central region of the duplex, while still leading to effective silencing. One of skill in the art will also recognize that it may be desirable to avoid mismatches in the central portion of the short RNAi agent/target transcript core region (see, e.g., Elbashir et al., EMBO J. 20:6877, 2001). For example, the 3′ nucleotides of the antisense strand of the siRNA often do not contribute significantly to specificity of the target recognition and may be less critical for target cleavage.

In some embodiments, short RNAi agents having duplex regions that exhibit one or more mismatches typically have no more than 6 total mismatches. In some embodiments, short RNAi agents have 1, 2, 3, 4, 5, or 6 total mismatches in their duplex regions. In some embodiments, the duplex regions have stretches of perfect complementarity that are at least 5 nt in length (e.g., 6, 7, or more nt). In some embodiments, no more than 20% of the nucleotides within a duplex region are mismatched. In some embodiments, no more than 15% of the nucleotides within a duplex region are mismatched. In some embodiments, no more than 10% of the nucleotides within a duplex region are mismatched. In some embodiments, no more than 5% of the nucleotides within a duplex region are mismatched. In some embodiments, none of the nucleotides within a duplex region are mismatched. Duplex regions may include two stretches of perfect complementarity separated by a region of mismatch. In some embodiments, there are multiple areas of mismatch.

In some embodiments, core regions (e.g., formed by hybridization of one strand of a short RNAi agent with a target transcript) that exhibit one or more mismatches typically have no more than 6 total mismatches. In some embodiments, core regions have 1, 2, 3, 4, 5, or 6 total mismatches. In some embodiments, core regions comprise stretches of perfect complementarity that are at least 5 nt in length (e.g., 6, 7, or more nt). In some embodiments, no more than 20% of the nucleotides within a core region are mismatched. In some embodiments, no more than 15% of the nucleotides within a core region are mismatched. In some embodiments, no more than 10% of the nucleotides within a core region are mismatched. In some embodiments, no more than 5% of the nucleotides within a core region are mismatched. In some embodiments, none of the nucleotides within a core region are mismatched. Core regions may include two stretches of perfect complementarity separated by a region of mismatch. In some embodiments, there are multiple areas of mismatch.
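The mismatch limits recited for duplex regions and core regions (total mismatches, fraction mismatched, and length of the longest perfectly complementary stretch) can be computed with a short sketch. It assumes two equal-length RNA strings written 5′ to 3′ that pair in antiparallel orientation without bulges, and it counts only Watson-Crick pairs as matches, so a G-U wobble is scored as a mismatch. The names `mismatch_report` and `within_limits` and the default thresholds (6 mismatches, 20%, 5 nt) are illustrative choices drawn from one set of the embodiments above.

```python
from dataclasses import dataclass

# Watson-Crick pairs only (assumption: G-U wobble counts as a mismatch).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


@dataclass
class MismatchReport:
    total: int                 # number of mismatched positions
    fraction: float            # mismatched positions / region length
    longest_perfect_run: int   # longest stretch of consecutive matched pairs


def mismatch_report(strand: str, partner: str) -> MismatchReport:
    """Compare two equal-length strands (both 5'->3') paired antiparallel."""
    if not strand or len(strand) != len(partner):
        raise ValueError("strands must be non-empty and of equal length")
    # Position i of one strand faces position n-1-i of the other.
    paired = [(a, b) in PAIRS for a, b in zip(strand, reversed(partner))]
    total = paired.count(False)
    longest = run = 0
    for ok in paired:
        run = run + 1 if ok else 0
        longest = max(longest, run)
    return MismatchReport(total, total / len(paired), longest)


def within_limits(report: MismatchReport, max_total: int = 6,
                  max_fraction: float = 0.20, min_run: int = 5) -> bool:
    """Check a report against one illustrative set of limits from the text."""
    return (report.total <= max_total
            and report.fraction <= max_fraction
            and report.longest_perfect_run >= min_run)


# Arbitrary example: a 19-nt target portion and an inhibitory region that is
# fully complementary except at its central position.
target = "GCAUCGAUUAGCCGAUAGC"
perfect = target.translate(str.maketrans("ACGU", "UGCA"))[::-1]
inhibitory = perfect[:9] + target[9] + perfect[10:]
report = mismatch_report(inhibitory, target)
print(report, within_limits(report))
```

The thresholds are parameters rather than constants because the text recites several alternative limits (e.g., 15%, 10%, or 5% mismatched).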

In some embodiments, one or both strands of a short RNAi agent may include one or more “extra” nucleotides that form a “bulge.” One or more bulges (e.g., 5 nt-10 nt long) may be present.

In some embodiments, short RNAi agents can be designed and/or predicted using one or more of a large number of available algorithms. To give but a few examples, the following resources can be utilized to design and/or predict RNAi agents: algorithms found at Alnylam Online, Dharmacon Online, OligoEngine Online, Molecula Online, Ambion Online, BioPredsi Online, RNAi Web Online, Chang Bioscience Online, Invitrogen Online, LentiWeb Online, GenScript Online, Protocol Online; Reynolds et al., 2004, Nat. Biotechnol., 22:326; Naito et al., 2006, Nucleic Acids Res., 34:W448; Li et al., 2007, RNA, 13:1765; Yiu et al., 2005, Bioinformatics, 21:144; and Jia et al., 2006, BMC Bioinformatics, 7:271; each of which is incorporated herein by reference.

micro RNAs

micro RNAs (miRNAs) are genomically encoded non-coding RNAs of about 21-23 nucleotides in length that help regulate gene expression, particularly during development (see, e.g., Bartel, 2004, Cell, 116:281; Novina and Sharp, 2004, Nature, 430:161; and U.S. Patent Publication 2005/0059005; also reviewed in Wang and Li, 2007, Front. Biosci., 12:3975; and Zhao, 2007, Trends Biochem. Sci., 32:189; each of which is incorporated herein by reference). The phenomenon of RNA interference, broadly defined, includes the endogenously induced gene silencing effects of miRNAs as well as silencing triggered by foreign dsRNA. Mature miRNAs are structurally similar to siRNAs produced from exogenous dsRNA, but before reaching maturity, miRNAs first undergo extensive post-transcriptional modification. An miRNA is typically expressed from a much longer RNA-coding gene as a primary transcript known as a pri-miRNA, which is processed in the cell nucleus to a 70-nucleotide stem-loop structure called a pre-miRNA by the microprocessor complex. This complex consists of an RNase III enzyme called Drosha and a dsRNA-binding protein Pasha. The dsRNA portion of this pre-miRNA is bound and cleaved by dicer to produce the mature miRNA molecule that can be integrated into the RISC complex; thus, miRNA and siRNA share the same cellular machinery downstream of their initial processing (Gregory et al., 2006, Meth. Mol. Biol., 342:33; incorporated herein by reference). In general, miRNAs are not perfectly complementary to their target transcripts.

In some embodiments, miRNAs can range between 18 nt-26 nt in length. Typically, miRNAs are single-stranded. However, in some embodiments, miRNAs may be at least partially double-stranded. In certain embodiments, miRNAs may comprise an RNA duplex (referred to herein as a “duplex region”) and may optionally further comprise one or two single-stranded overhangs. In some embodiments, an RNAi agent comprises a duplex region ranging from 15 bp to 29 bp in length and optionally further comprising one to three single-stranded overhangs. An miRNA may be formed from two RNA molecules that hybridize together, or may alternatively be generated from a single RNA molecule that includes a self-hybridizing portion. The duplex portion of an miRNA usually, but not necessarily, comprises one or more bulges consisting of one or more unpaired nucleotides. One strand of an miRNA includes a portion that hybridizes with a target RNA. In certain embodiments, one strand of the miRNA is not precisely complementary with a region of the target RNA, meaning that the miRNA hybridizes to the target RNA with one or more mismatches. In some embodiments, one strand of the miRNA is precisely complementary with a region of the target RNA, meaning that the miRNA hybridizes to the target RNA with no mismatches. Typically, miRNAs are thought to mediate inhibition of gene expression by inhibiting translation of target transcripts. However, in some embodiments, miRNAs may mediate inhibition of gene expression by causing degradation of target transcripts.

In some embodiments, miRNAs have a duplex region of about 15 nt, about 16 nt, about 17 nt, about 18 nt, about 19 nt, about 20 nt, about 21 nt, about 22 nt, about 23 nt, about 24 nt, about 25 nt, about 26 nt, about 27 nt, about 28 nt, or about 29 nt in length. In some embodiments, miRNAs have an inhibitory region of about 15 nt, about 16 nt, about 17 nt, about 18 nt, about 19 nt, about 20 nt, about 21 nt, about 22 nt, about 23 nt, about 24 nt, about 25 nt, about 26 nt, about 27 nt, about 28 nt, or about 29 nt in length.

In some embodiments, miRNAs have duplex regions that exhibit one or more mismatches. In some embodiments, miRNAs have duplex regions that exhibit 1, 2, 3, 4, 5, 6, 7, 8, or 9 total mismatches. In some embodiments, the duplex regions have stretches of perfect complementarity that are 1, 2, 3, 4, 5, 6, 7, 8, or 9 nt in length. Duplex regions may include two stretches of perfect complementarity separated by a region of mismatch. In some embodiments, there are multiple areas of mismatch. In some embodiments, about 50% of the nucleotides within a duplex region are mismatched. In some embodiments, about 40% of the nucleotides within a duplex region are mismatched. In some embodiments, about 30% of the nucleotides within a duplex region are mismatched. In some embodiments, about 20% of the nucleotides within a duplex region are mismatched. In some embodiments, about 10% of the nucleotides within a duplex region are mismatched. In some embodiments, about 5% of the nucleotides within a duplex region are mismatched.

In some embodiments, core regions (e.g., formed by hybridization of one strand of an miRNA with a target transcript) have 1, 2, 3, 4, 5, 6, 7, 8, or 9 total mismatches. In some embodiments, core regions comprise stretches of perfect complementarity that are 1, 2, 3, 4, 5, 6, 7, 8, or 9 nt in length. Core regions may include two stretches of perfect complementarity separated by a region of mismatch. In some embodiments, there are multiple areas of mismatch. In some embodiments, about 50% of the nucleotides within a core region are mismatched. In some embodiments, about 40% of the nucleotides within a core region are mismatched. In some embodiments, about 30% of the nucleotides within a core region are mismatched. In some embodiments, about 20% of the nucleotides within a core region are mismatched. In some embodiments, about 10% of the nucleotides within a core region are mismatched. In some embodiments, about 5% of the nucleotides within a core region are mismatched.

In some embodiments, one or both strands of an miRNA may include one or more “extra” nucleotides that form a “bulge.” One or more bulges (e.g., 5 nt-10 nt long) may be present.

In some embodiments, short RNAi agents can be designed and/or predicted using one or more of a large number of available algorithms. To give but a few examples, the following resources can be utilized to design and/or predict RNAi agents: algorithms at PicTar Online, Protocol Online, EMBL Online; Rehmsmeier et al., 2004, RNA, 10:1507; Kim et al., 2006, BMC Bioinformatics, 7:411; Lewis et al., 2003, Cell, 115:787; and Krek et al., 2005, Nat. Genet., 37:495; each of which is incorporated herein by reference.

Antisense RNAs

In some embodiments, nucleic acids that can be associated with supercharged proteins include antisense RNAs. Antisense RNAs are typically RNA strands of various lengths that bind to target transcripts and block their translation (e.g., either through degradation of mRNA and/or by sterically blocking critical steps of the translation process).

Antisense RNAs exhibit many of the same characteristics as the RNAi agents described above. For example, antisense RNAs exhibit sufficient complementarity to a target transcript to allow hybridization of the antisense RNA to the target transcript. Mismatches are tolerated, as described above for RNAi agents, as long as hybridization to the target can still occur. In general, antisense RNAs are longer than short RNAi agents, and can be of any length, as long as hybridization can still occur. In some embodiments, antisense RNAs are about 20 nt, about 30 nt, about 40 nt, about 50 nt, about 75 nt, about 100 nt, about 150 nt, about 200 nt, about 250 nt, about 500 nt, or longer. In some embodiments, antisense RNAs comprise an inhibitory region that hybridizes with a target transcript of about 20 nt, about 30 nt, about 40 nt, about 50 nt, about 75 nt, about 100 nt, about 150 nt, about 200 nt, about 250 nt, about 500 nt, or longer.
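By way of non-limiting illustration only, a fully complementary antisense RNA can be derived from a region of a target transcript by taking the reverse complement of that region. The following Python sketch assumes perfect Watson-Crick complementarity (although, as noted above, mismatches are tolerated) and an unmodified A/C/G/U alphabet. The function name `antisense_rna` and the six-nucleotide example sequence are hypothetical and are far shorter than the antisense lengths recited above.

```python
# Illustrative sketch only: derive a perfectly complementary antisense RNA.
# Assumption: the target region uses only unmodified A, C, G, and U.
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def antisense_rna(target_region_5to3: str) -> str:
    """Return the reverse complement of a target region, written 5' to 3'."""
    region = target_region_5to3.upper()
    unknown = set(region) - set(RNA_COMPLEMENT)
    if unknown:
        raise ValueError(f"unsupported bases: {sorted(unknown)}")
    # Reverse first, then complement each base, so the result reads 5' to 3'.
    return "".join(RNA_COMPLEMENT[base] for base in reversed(region))


# Hypothetical six-nucleotide target region.
print(antisense_rna("AUGGCC"))  # GGCCAU
```

Applying the function twice returns the original region, which is a simple consistency check on the complement table.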

Ribozymes

In some embodiments, nucleic acids that can be associated with supercharged proteins include ribozymes. A ribozyme (from ribonucleic acid enzyme; also called RNA enzyme or catalytic RNA) is an RNA molecule that catalyzes a chemical reaction. Many natural ribozymes catalyze either the hydrolysis of one of their own phosphodiester bonds, or the hydrolysis of bonds in other RNAs, but they have also been found to catalyze the peptidyl transferase activity of the ribosome.

In some embodiments, ribozymes used for gene-knockdown applications have a catalytic domain that is flanked by sequences complementary to a target transcript. The mechanism of gene silencing generally involves binding of a ribozyme to a target transcript via Watson-Crick base pairing, followed by cleavage of the phosphodiester backbone of the target transcript by transesterification (Kurreck, 2003, Eur. J. Biochem., 270:1628; Sun et al., 2000, Pharmacol. Rev., 52:325; Doudna and Cech, 2002, Nature, 418:222; Goodchild, 2000, Curr. Opin. Mol. Ther., 2:272; Michienzi and Rossi, 2001, Methods Enzymol., 341:581; each of which is incorporated herein by reference). Once the target transcript is destroyed, ribozymes dissociate and subsequently can repeat cleavage on additional substrates. In some embodiments, a ribozyme to be associated with a supercharged protein is a hammerhead ribozyme. Hammerhead ribozymes were first isolated from viroid RNAs that undergo site-specific self-cleavage as part of their replication process.

In some embodiments, ribozymes are naturally-occurring ribozymes, including but not limited to, peptidyl transferase 23S rRNA, RNase P, Group I and Group II introns, GIR1 branching ribozyme, leadzyme, hairpin ribozyme, hammerhead ribozyme, HDV ribozyme, mammalian CPEB3 ribozyme, VS ribozyme, glmS ribozyme, and CoTC ribozyme.

In some embodiments, ribozymes are artificial ribozymes. For example, artificially-produced self-cleaving RNAs that have good enzymatic activity have been produced. Tang and Breaker (1997, Proc. Natl. Acad. Sci., 97:5784; incorporated herein by reference) isolated self-cleaving RNAs by in vitro selection of RNAs originating from random-sequence RNAs. Some of the synthetic ribozymes that were produced had novel structures, while some were similar to the naturally occurring hammerhead ribozyme.

In some embodiments, techniques used to discover artificial ribozymes involve Darwinian evolution. This approach takes advantage of RNA's dual nature as both a catalyst and an informational polymer, thereby allowing an investigator to produce vast populations of RNA catalysts using polymerase enzymes. Ribozymes are mutated by reverse transcribing them with reverse transcriptase into various cDNAs and amplifying the cDNAs with mutagenic PCR. The selection parameters in these experiments often differ. To give but one example, an approach for selecting a ligase ribozyme might involve using biotin tags, which are covalently linked to a substrate. If a candidate ribozyme possesses the desired ligase activity, a streptavidin matrix can be used to recover the active molecules.

Deoxyribozymes

In some embodiments, nucleic acids that can be associated with supercharged proteins include catalytic DNAs (“deoxyribozymes”). Deoxyribozymes bind to RNA substrates, typically via Watson-Crick base pairing, and site-specifically cleave target transcripts, similarly to ribozymes. Deoxyribozyme molecules have been produced by in vitro evolution since no natural examples of DNA enzymes are known. Two different catalytic motifs, with different cleavage site specificities, have been identified. Deoxyribozymes have been produced with different cleavage specificities, allowing researchers to target all possible dinucleotide sequences.

Aptamers

In some embodiments, nucleic acids that can be associated with supercharged proteins include aptamers. Aptamers are oligonucleic acid molecules that bind specific target molecules. Aptamers may be engineered through repeated rounds of in vitro selection (e.g., via systematic evolution of ligands by exponential enrichment, “SELEX”) to bind to various molecular targets such as small molecules, proteins, nucleic acids, cells, tissues, and/or organisms. Aptamers typically bind to their targets due to the three-dimensional structure of the aptamer. Aptamers generally do not bind to their targets via traditional Watson-Crick base pairing.

The first aptamer-based drug approved by the U.S. Food and Drug Administration (FDA), for the treatment of age-related macular degeneration (AMD), is MACUGEN® (OSI Pharmaceuticals). In addition, ARC1779 (Archemix, Cambridge, Mass.) is a potent, selective, first-in-class antagonist of von Willebrand Factor (vWF) and is being evaluated in patients diagnosed with acute coronary syndrome (ACS) who are undergoing percutaneous coronary intervention (PCI).

In general, unmodified aptamers are usually cleared rapidly from the bloodstream, with a half-life of minutes to hours. This is presumably due to nuclease degradation and clearance from the body by the kidneys, which occur because aptamers tend to have low molecular weights. Unmodified aptamers may be particularly suited for treating transient conditions (e.g., blood clotting), and/or for treating organs where local delivery is possible (e.g., the eye, skin, etc.). Rapid clearance can be desirable in applications such as in vivo diagnostic imaging. For example, a tenascin-binding aptamer (Schering AG) can be utilized for cancer imaging. In some embodiments, aptamers with increased half-lives are desirable. Certain modifications (e.g., 2′-fluorine-substituted pyrimidines, polyethylene glycol (PEG) linkage, etc.) may increase the half-life of aptamers.

RNAs that Induce Triple Helix Formation

In some embodiments, nucleic acids that can be associated with supercharged proteins include RNAs that induce triple helix formation. In some embodiments, endogenous target gene expression may be reduced by targeting deoxyribonucleotide sequences complementary to the regulatory region of the target gene (i.e., the target gene's promoter and/or enhancers) to form triple helical structures that prevent transcription of the target gene in target muscle cells in the body (see generally, Helene, 1991, Anticancer Drug Des. 6:569; Helene et al., 1992, Ann. N.Y. Acad. Sci. 660:27; and Maher, 1992, BioEssays 14:807).

Vectors

In some embodiments, nucleic acids that can be associated with supercharged proteins include vectors. As used herein, “vector” refers to a nucleic acid molecule which can transport another nucleic acid to which it has been linked. In some embodiments, vectors can achieve extra-chromosomal replication and/or expression of nucleic acids to which they are linked in a host cell such as a eukaryotic and/or prokaryotic cell. Exemplary vectors include plasmids, cosmids, viruses, viral genomes, artificial chromosomes, bacterial artificial chromosomes, and/or yeast artificial chromosomes. In certain embodiments, vectors include elements such as promoters, enhancers, ribosomal binding sites, etc.

In some embodiments, vectors are capable of directing the expression of operatively linked genes (“expression vectors”). In some embodiments, expression of the operatively linked gene may result in production of a functional nucleic acid (e.g., RNAi agent, antisense RNA, aptamer, ribozyme, etc.). In some embodiments, expression of the operatively linked gene may result in production of a protein (e.g., a therapeutic, diagnostic, and/or prophylactic protein). In some embodiments, a therapeutic protein is a protein-based drug (e.g., an antibody-based drug, a peptide-based drug, etc.). In some embodiments, a prophylactic protein may be a protein antigen and/or antibody. In some embodiments, a diagnostic protein may be one that exhibits certain characteristics before delivery to a cell by a supercharged protein, but exhibits detectably different characteristics after delivery.

In some embodiments, a vector is a viral vector. In some embodiments, a vector is of bacterial origin. In some embodiments, a vector is of fungal origin. In some embodiments, a vector is of eukaryotic origin. In some embodiments, a vector is of prokaryotic origin. In some embodiments, a vector may be delivered to a cell via a supercharged protein, where it subsequently replicates in vivo. In some embodiments, a vector may be delivered to a cell via a supercharged protein, where it is subsequently transcribed in vivo.

Labeled Nucleic Acids

In some embodiments, nucleic acids in accordance with the invention are tagged with a detectable label. Suitable labels that can be used in accordance with the invention include, but are not limited to, fluorescent, chemiluminescent, phosphorescent, and/or radioactive labels. In some embodiments, nucleic acids comprise at least one nucleotide that is attached to at least one fluorescent moiety (e.g., fluorescein, rhodamine, coumarin, cyanine-3, cyanine-5, Alexa Fluor, DyLight Fluor, etc.). Any fluorescent moiety that can be associated with a nucleic acid can be utilized in accordance with the invention. In some embodiments, nucleic acids comprise at least one radioactive nucleotide (e.g., a nucleotide containing ³²P or ³⁵S). In some embodiments, nucleic acids comprise at least one nucleotide that is attached to at least one radioactive moiety.

Cellular Nucleic Acids Targeted by Delivered Nucleic Acids

In some embodiments, nucleic acids (e.g., siRNAs, shRNAs, miRNAs, antisense RNAs, ribozymes, etc.) to be delivered to cells using supercharged proteins are useful for targeting cellular nucleic acids for degradation. Any cellular nucleic acid can be targeted for degradation. Exemplary cellular nucleic acids that can be targeted for degradation include, but are not limited to, GAPDH, β-actin, β-tubulin, and c-myc.

Peptides and Proteins

The present invention provides systems and methods for delivery of proteins or peptides to cells in vivo, ex vivo, or in vitro. Such systems and methods typically involve association of a peptide or protein with supercharged proteins to form a complex, and delivery of the complex to a cell. In some embodiments, the protein or peptide to be delivered by the supercharged protein has therapeutic activity. In some embodiments, delivery of the complex to a cell involves administering a complex comprising a supercharged protein associated with a peptide or protein to a subject in need thereof.

In some embodiments, a peptide or protein by itself may not be able to enter a cell, but is able to enter a cell when associated with a supercharged protein, for example, via a covalent bond or a non-covalent interaction. In some embodiments, the complex includes a peptide or protein that is covalently bound to a supercharged protein. In some embodiments, the complex includes a peptide or protein fused to a supercharged protein via a peptide bond, for example, via direct fusion or via a peptide linker as provided herein. In some embodiments, the complex includes a peptide or protein that is bound to a supercharged protein by non-covalent interaction. In some embodiments, a supercharged protein is utilized to allow a peptide or protein to enter a cell. In some embodiments, the peptide or protein delivered to the cell in a complex with a supercharged protein is separated from the supercharged protein after delivery to the cell, for example by cleavage of a linker peptide by a cellular protease (e.g., an endosomal protease) or by dissociation of a peptide or protein from a supercharged protein in a specific cellular microenvironment, for example the endosome. In some embodiments, peptides or proteins delivered to a cell by a system or method provided by this invention have therapeutic activity.

In some embodiments, a functional protein is delivered to a cell in vivo, ex vivo, or in vitro by a system or method provided herein. In some embodiments, a functional protein is a protein able to carry out a biological function within the target cell, for example, an enzyme able to catalyze an enzymatic reaction in the target cell, a transcription factor able to interact with the genome of a target cell and to activate or inhibit transcription of a target gene in the cell, a recombinase able to interact with the genome of a target cell and to recombine its target sites, a nuclease able to bind and cut a nucleic acid molecule within a target cell, a binding partner of a cellular molecule able to bind that molecule in the target cell, or a substrate of an enzyme of the target cell able to interact with that enzyme in the target cell.

In some embodiments, a functional protein is associated with a supercharged protein and subsequently delivered to a cell in vivo, ex vivo, or in vitro. A functional protein can be associated with a supercharged protein via a covalent bond, for example, a peptide bond, a carbon-carbon bond, or a disulfide bond, or via non-covalent interaction. In some embodiments, a functional protein is produced that is fused to a supercharged protein via a peptide bond, for example, directly or via a peptide linker as described herein. Methods for the generation and isolation of fusion proteins are well known to those of skill in the art.

In some embodiments, a method for generating a fusion of a functional protein and a supercharged protein includes the generation of an expression nucleic acid construct containing the coding sequences of the functional protein and the supercharged protein, as well as, optionally, a peptide linker, in frame; the expression of such a recombinant fusion protein in a prokaryotic or eukaryotic cell in culture; and the extraction and purification of the fusion protein. In some embodiments, a nucleic acid construct is generated in the form of an expression vector, for example, a vector suitable for propagation in a bacterial host and for expression in a prokaryotic or eukaryotic cell.

In some embodiments, a vector suitable for fusion protein expression is generated by cloning of a nucleotide sequence coding for a functional protein to be delivered into a cloning vector including a nucleotide sequence coding for a supercharged protein under the control of a eukaryotic and/or a prokaryotic promoter, by a cloning approach that results in both coding sequences being in frame with each other. In some embodiments, the cloning vector includes a nucleotide sequence coding for a peptide linker between a nucleotide sequence coding for a supercharged protein and a restriction site useful for inserting a nucleotide sequence coding for a protein in frame with the linker and the supercharged protein. In some embodiments, the cloning vector further includes an additional sequence enhancing expression of a fusion protein in a prokaryotic or eukaryotic cell or facilitating purification of expressed fusion proteins from such cells, for example, a sequence stabilizing a transcript encoding the fusion protein, such as a poly-A signal, a spliceable intron, a sequence encoding an in-frame peptide or protein domain tag (e.g., an Arg-tag, calmodulin-binding peptide tag, cellulose-binding domain tag, DsbA tag, c-myc-tag, glutathione S-transferase tag, FLAG-tag, HAT-tag, His-tag, maltose-binding protein tag, NusA tag, S-tag, SBP-tag, Strep-tag, or thioredoxin tag), or a selection marker or reporter cassette allowing for identification of cells harboring and expressing the expression construct and/or quantifying the level of expression in such cells. Methods for cloning and expressing fusion proteins are well known to those in the art; see, for example, Sambrook et al., Molecular Cloning: a laboratory manual, Volume 1-3, CSHL Press (1989); Gellissen et al., Production of recombinant proteins, Wiley-VCH, 2005.
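By way of non-limiting illustration only, the requirement that the coding sequences of the supercharged protein, the optional peptide linker, and the functional protein be in frame with each other can be checked computationally before cloning. The following Python sketch is a simplified check under stated assumptions and is not a method described in the references cited above. The function names `codons` and `build_fusion_cds` are hypothetical, the three short DNA segments are invented placeholders rather than actual sequences of any supercharged protein, linker, or cargo, and the sketch assumes the standard genetic code stop codons (TAA, TAG, TGA), intron-free coding sequences, and that only the final segment may end in a stop codon.

```python
from typing import List

# Standard genetic code stop codons (DNA alphabet).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def codons(seq: str) -> List[str]:
    """Split a coding sequence into consecutive triplets."""
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


def build_fusion_cds(segments: List[str]) -> str:
    """Concatenate coding segments after checking that the fusion stays in frame.

    Each segment must have a length that is a multiple of 3, and no stop codon
    may occur before the final codon of the final segment.
    """
    fused = ""
    for index, raw_segment in enumerate(segments):
        segment = raw_segment.upper()
        if len(segment) % 3 != 0:
            raise ValueError(f"segment {index} would shift the reading frame")
        segment_codons = codons(segment)
        is_last = index == len(segments) - 1
        # The terminal codon of the last segment is allowed to be a stop codon.
        internal = segment_codons[:-1] if is_last else segment_codons
        if any(codon in STOP_CODONS for codon in internal):
            raise ValueError(f"segment {index} contains a premature stop codon")
        fused += segment
    return fused


# Hypothetical placeholder segments (not real protein-coding sequences).
supercharged_cds = "ATGAAACGT"  # Met-Lys-Arg, no stop codon
linker_cds = "GGTGGCAGC"        # Gly-Gly-Ser
cargo_cds = "GCTGAATAA"         # Ala-Glu-stop

print(build_fusion_cds([supercharged_cds, linker_cds, cargo_cds]))
# ATGAAACGTGGTGGCAGCGCTGAATAA
```

A segment whose length is not a multiple of three, or an upstream segment that retains its own stop codon, raises an error in this sketch, reflecting the two most common ways in which a fusion construct falls out of frame or terminates prematurely.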

In some embodiments, the protein is associated with a supercharged GFP, for example +36 GFP, for delivery to a target cell. While +36 GFP is capable of delivering exogenous protein into a very high fraction of treated cells, it is likely that only a portion of the delivered protein can reach a given subcellular location. The importance of endosomal disruption in the delivery of macromolecules has been previously demonstrated (Wadia et al., Nat. Med. 10, 310-315, 2004) and in some embodiments, additional steps to effect enhanced endosomal escape, as provided herein or known in the art, are performed. Highly efficient protein internalization, when coupled with effective endosomal release, has the potential to minimize the requisite doses of exogenous protein agents, enhancing their potential as research tools and leads for therapeutic development.

The widespread use of GFP fusion proteins in biological research suggests that fusions with +36 GFP may represent a fairly general approach to constructing cell-permeable protein reagents. The high solubility and aggregation resistance of supercharged GFP (Lawrence et al., JACS 129, 10110-10112, 2007) may also facilitate the isolation of such fusions. Based on the rapid and potent protein delivery properties of superpositively charged proteins as provided herein, for example of +36 GFP, and on the widespread use of PTDs such as Tat and polyarginine, our results collectively suggest the potential of superpositively charged proteins as a new platform for the growing number of protein delivery applications.

In some embodiments, a fusion protein including a peptide or protein and a supercharged protein is administered to a target cell after isolation and/or purification. In some embodiments, cells expressing such a fusion protein are collected, lysed, and the soluble fraction of the cell lysate is administered to a target cell after removal of the insoluble fraction by centrifugation or filtration. In some embodiments, cells expressing a fusion protein including a peptide or protein and a supercharged protein are collected and lysed, and the fusion protein is isolated from the lysate, for example from the soluble fraction of the lysate. Protein isolation methods and technologies are well known to those of skill in the art and include, for example, affinity chromatography or immunoprecipitation. The methods suitable for isolating and/or purifying a specific fusion protein will depend on the nature of the fusion protein. For example, a His-tagged fusion protein can readily be isolated and purified via Ni or Co ion chromatography, while fusion proteins tagged with other peptides or domains or untagged fusion proteins can be purified by other well established methods.

Proteins suitable for delivery to a target cell in vivo, ex vivo, or in vitro, by a system or method provided herein will be apparent to those of skill in the art and include, for example, DNA-binding proteins, such as transcription factors, histones, zinc-finger proteins, including zinc-finger nucleases, cytoskeletal proteins, receptor proteins, chaperone proteins, intracellular ligands, epigenetic perturbators, such as histone acetyltransferase or deacetylases, DNA methyltransferases, modulators of cellular signaling pathways, such as kinases and phosphatases, proteases targeting a specific cellular protein and, for example, disrupting or activating a signaling pathway, and other enzymes, such as oxidoreductases, transferases, hydrolases, lyases, isomerases, or ligases.

In some embodiments, a method or system provided herein is used to deliver a therapeutic protein to a cell. Examples of therapeutic proteins include, but are not limited to, a protein preventing or inhibiting the proliferation of a cell or cell population, such as a tumor suppressor protein (e.g., p53, retinoblastoma protein, BRCA1, BRCA2, PTEN, APC, CD95, ST7, or ST14); a protein inducing cell death in a cell or cell population (e.g., p53, a proapoptotic member of the BCL-2 family of proteins, or a caspase); a protein preventing or inhibiting metastasis formation, such as a metastasis suppressor protein (e.g., BRMS1, CRSP3, DRG1, KAI1, KISS1, NM23, or a TIMP-family protein); a protein inducing proliferation of a cell or cell population, such as a growth factor (e.g., a BMP-family growth factor, EGF, EPO, FGF, G-CSF, GM-CSF, a GDF-family growth factor, HGF, HDGF, IGF, PDGF, TPO, TGF-α, TGF-β, or VEGF); or a zinc finger nuclease targeting a specific site within the genome of a cell.

Transcription Factors

In some embodiments, a transcription factor is delivered to a cell by a system or method provided by aspects of this invention. In some embodiments, a transcription factor is delivered to a cell in an amount sufficient to activate or inhibit transcription of a target gene of the transcription factor within the cell. In some embodiments, a transcription factor is delivered in an amount and over a time period sufficient to effect a change in the phenotype of a target cell, for example, a change in cellular function, or a change in developmental potential.

In some embodiments, the transcription factor delivered to a cell by a system or method of this invention is a basic-helix-loop-helix factor, for example, a leucine zipper factor (ZIP, e.g., an AP-1 (-like) component, for example, c-Fos or c-Jun, a CREB or C/EBP-like factor, a bZIP/PAR factor, a G-box binding factor, a ZIP factor); a helix-loop-helix factor (bHLH, e.g., a ubiquitous (class A) factor, a myogenic transcription factor (e.g., MyoD), an achaete-scute factor, a Tal/Twist/Atonal/Hen factor); a helix-loop-helix/leucine zipper factor (bHLH-ZIP, e.g., a ubiquitous bHLH-ZIP factor, such as a USF or SREBP factor, a cell-cycle controlling factor, e.g., c-Myc); a NF-1 factor (e.g., NF-1A, NF-1B, NF-1C, NF-1X); or a RF-X factor (e.g., RF-X1, RF-X2, RF-X3, RF-X4, RF-X5, ANK).

In some embodiments, the transcription factor delivered to a cell by a system or method of this invention is a zinc-coordinating DNA-binding domain containing transcription factor, for example, a Cys4 zinc finger nuclear receptor type factor (e.g., a steroid hormone receptor, a thyroid hormone receptor-like factor); a Cys4 zinc finger factor (e.g., a GATA-factor); a Cys2His2 zinc finger domain factor (e.g., a ubiquitous factor including, for example, TFIIIA and Sp1, a developmental/cell cycle regulator, including, for example, Krüppel and Krüppel-like factors (e.g., Klf2, Klf4)); a factor with NF-6B-like binding properties; a Cys6 cysteine-zinc cluster factor; or a zinc finger factor of alternating composition.

In some embodiments, the transcription factor delivered to a cell by a system or method of this invention is a helix-turn-helix transcription factor, for example, a homeo domain factor (e.g., a homeo domain only factor, including, for example, Ubx, a POU domain factor, for example, POU5F1 or other Oct-family transcription factor, a homeo domain with LIM region factor, a homeo domain plus zinc finger motif factor); a paired box transcription factor (e.g., a paired plus homeo domain factor, or a paired domain only factor); a forkhead/winged helix factor (e.g., a developmental regulator, for example FOXD3, a tissue-specific regulator, a cell-cycle controlling factor); a heat shock factor; a tryptophan cluster factor (e.g., a Myb factor, an Ets-type factor, an interferon regulatory factor); or a transcriptional enhancer domain factor (e.g., a TEA factor, for example, TEAD1, TEAD2, TEAD3, or TEAD4).

In some embodiments, the transcription factor delivered to a cell by a system or method of this invention is a beta-scaffold factor with minor groove contacts, for example, a Rel-homology region factor (e.g., a Rel/ankyrin factor, a NF-kappaB factor, an ankyrin only factor, a Nuclear Factor of Activated T-cells, for example NFATC1, NFATC2, or NFATC3); a STAT factor (e.g., a STAT family factor, for example, STAT3); a p53 factor; a MADS box factor (e.g., a regulator of differentiation, for example, Mef2, a responder to external signals, for example, serum response factor (SRF)); a beta-barrel alpha-helix transcription factor (e.g., a TATA binding protein, a sry-box only factor, for example, Sox2, a TCF factor, for example, TCF1, a HMG2-related factor, for example, SSRP1, a MATA factor); a heteromeric CCAAT factor; a grainyhead factor; a cold-shock domain factor; or a runt factor.

In some embodiments, the transcription factor delivered to a cell by a system or method of this invention is a copper fist protein transcription factor, a HMG factor (e.g., a HMGI-Y or HMGA1), a pocket domain transcription factor, a E1A-like factor, a AP2/EREBP-related factor, an ARF factor, an ABI factor, or a RAV factor.

Other transcription factors suitable for delivery to a cell by systems or methods provided herein will be apparent to those of skill in the art.

In some embodiments, a transcription factor or other protein specific for a pluripotent cell type is delivered to a somatic cell by a system or method provided herein. In some embodiments, a transcription factor or other protein specific for embryonic stem cells (ES cells) is delivered to a somatic cell by a system or method provided herein. Transcription factors and other proteins specific for a pluripotent cell type and/or ES cells are well known to those of skill in the art, and include Oct4, Sox2, Klf4, Nanog, Lin-28, and c-Myc (see, for example, Takahashi et al., Cell 126(4):663-76; Takahashi et al., Cell 131(5):861-72, 2007; Yu et al., Science 318(5858):1917-20, 2007).

In some embodiments, a reprogramming factor or a combination of reprogramming factors is delivered to a somatic cell in an amount and for a time period sufficient to reprogram the somatic cell to a pluripotent state. Reprogramming factors are well known in the art and include transcription factors and other proteins specific for a pluripotent cell type, for example, ES cells, as described herein, including, but not limited to, Oct4 (POU5F1), Sox2, Klf4, and c-Myc, as well as Nanog and Lin-28. Combinations of transcription factors sufficient for reprogramming are well known in the art and include, for example, the following combinations: Oct4, Sox2, Klf4, and c-Myc; Oct4, Sox2, Nanog, and Lin-28; Oct4, Klf4, and c-Myc; Oct4 and Sox2; Oct4 and Klf4. It is well established in the art that expression of reprogramming factors, or of combinations of reprogramming factors, in somatic cells, for example, in primary fibroblast cells or terminally differentiated lymphocyte cells of an adult subject, results in the reprogramming of some of these cells to a pluripotent state. See, for example, Takahashi et al., Cell, 2007; Yu et al., Science, 2007; Okita et al., Nature 448(7151):313-7, 2007; Takahashi et al., Nature Protocols 2(12):3081-9, 2007; Wernig et al., Nature 448(7151):318-24, 2007; and Hanna et al., Cell 133(2):250-64, 2008. Initially, reprogramming technology relied on expression of reprogramming factors in target cells from viral vectors. In some embodiments of this invention, the drawback of viral integration and, thus, undesired modification of the cell genome is overcome by delivering a suitable combination of reprogramming factors to a target cell. Methods for direct delivery of reprogramming factors associated with protein transduction domains (e.g., TAT and Arg₉) to somatic cells are known to those in the art (Pan et al., Molecular Biology Reports, PMID 19669668, 2009; Zhou et al., Cell Stem Cell 4:381-84, 2009; and Kim et al., Cell Stem Cell 4:472-76, 2009). However, such methods are either very inefficient or rely on additional administration of a compound perturbing the epigenetic state of the target cell.

In contrast to methods employing conventional protein transduction domains, some systems or methods for the delivery of functional proteins to a cell as provided herein are highly efficient, allowing a higher end-concentration of a functional protein to be achieved in a target cell while reducing or eliminating the toxicity and cell viability problems that often accompany and limit systems employing conventional protein transduction domains (see Example 7).

In some embodiments, a target cell, for example, a somatic cell, is contacted with a reprogramming factor or a combination of reprogramming factors associated with a supercharged protein provided herein. In some embodiments, a target cell is contacted with a combination including 2-5 reprogramming factors associated with a supercharged protein provided herein. In some embodiments, a target cell is contacted with a combination including 2, 3, 4, 5, 6, 7, 8, 9, 10, or more reprogramming factors associated with a supercharged protein provided herein. In some embodiments, the target cell is a primary somatic cell and is contacted in vitro or ex vivo with a reprogramming factor.

In some embodiments, a target cell is contacted, or repeatedly contacted, with a combination of reprogramming factors associated with a supercharged protein as provided herein until the formation of a pluripotent cell is detected. Methods for detecting pluripotent cells are well known to those in the art and include, for example, morphological analysis, and detection of pluripotency-associated marker expression (e.g., SSEA1, SSEA4, Oct4, Sox2, and Nanog) by well established methods such as immunohistochemistry, fluorescence activated cell sorting (FACS), or fluorescent microscopy. In some embodiments, a target cell is contacted with a combination of reprogramming factors associated with a supercharged protein as provided herein for a period of at least about 10-12 days, at least about 12-15 days, at least about 15-20 days, at least about 20-25 days, at least about 25-30 days, at least about 30-40 days, at least about 40-50 days, at least about 50-60 days, at least about 60-70 days, or at least about 70-100 days.

In some embodiments, a reprogramming factor or a combination of reprogramming factors is delivered to a somatic cell by a method or system as provided herein in an amount and for a time period effective to reprogram the cell to a pluripotent state. As will be apparent to those of skill in the art, the amount necessary to reprogram a cell is dependent on various factors, for example, on the cell type and the treatment schedule. For example, it has been reported that primary fibroblast cells need a minimal time of 10-16 days of exposure to virally expressed reprogramming factors in order to subsequently establish a state of pluripotency (Brambrink et al., Cell Stem Cell 2(2):151-59, 2008; Stadtfeld et al., Cell Stem Cell 2(3):230-40, 2008). It has been demonstrated that fibroblast cells can be reprogrammed by contacting the cells with a combination of reprogramming factors associated with a conventional protein transduction domain for a period of time, followed by an incubation period in media not containing reprogramming factors, and repeating this cycle several times until reprogrammed cells are detectable. For example, human fibroblast cells can be reprogrammed with a combination of Oct4, Sox2, Klf4, and c-Myc fused to an Arg₉ protein transduction domain by incubation with whole cell extracts from cells expressing the reprogramming factors for 16 h, subsequent washing and incubation in ES cell media for several days, and repetition of this cycle 4-6 times (Kim et al., Cell Stem Cell 2009). In general, delivery of reprogramming factors to a target somatic cell by a system or method provided herein will be at a concentration below a concentration at which significant toxicity can be observed. The critical concentration will depend, for example, on the reprogramming factor, the supercharged protein it is associated with, the type of association, and the type of cell being treated.

A useful concentration of a transcription factor associated with a supercharged protein for delivery of the transcription factor to a specific cell type can be established by those of skill in the art by routine experimentation. In some embodiments, a target cell is contacted in vitro or ex vivo with a reprogramming factor at a concentration of about 1 pM to about 1 μM. In some embodiments, a target cell is contacted in vitro or ex vivo with a reprogramming factor at a concentration of about 1 pM, about 2.5 pM, about 5 pM, about 7.5 pM, about 10 pM, about 20 pM, about 25 pM, about 30 pM, about 40 pM, about 50 pM, about 60 pM, about 70 pM, about 75 pM, about 80 pM, about 90 pM, about 100 pM, about 200 pM, about 250 pM, about 300 pM, about 400 pM, about 500 pM, about 600 pM, about 700 pM, about 750 pM, about 800 pM, about 900 pM, about 1 nM, about 2 nM, about 3 nM, about 4 nM, about 5 nM, about 6 nM, about 7 nM, about 8 nM, about 9 nM, about 10 nM, about 20 nM, about 25 nM, about 30 nM, about 40 nM, about 50 nM, about 60 nM, about 70 nM, about 75 nM, about 80 nM, about 90 nM, about 100 nM, about 200 nM, about 250 nM, about 300 nM, about 400 nM, about 500 nM, about 600 nM, about 700 nM, about 750 nM, about 800 nM, about 900 nM, or about 1 μM. A useful time of reprogramming factor administration and, if necessary, incubation after administration in the absence of reprogramming factors, as well as a number of administration/incubation cycles useful to achieve reprogramming of a cell of a given cell type, can also be established by those of skill in the art by routine experimentation.

In some embodiments, the target cell for delivery of a reprogramming factor or a combination of reprogramming factors by a system or method provided herein is a primary cell obtained by a biopsy from a subject. In some embodiments, the subject is diagnosed to have a disease. In some embodiments, the disease is a degenerative disease characterized by diminished function of a specific cell type, for example, a neural cell. In some embodiments, a somatic cell obtained from a subject is reprogrammed to a pluripotent state by delivery of a reprogramming factor or a combination of reprogramming factors associated with a supercharged protein as provided herein. In some embodiments, a pluripotent cell is isolated after reprogramming of a somatic cell from a subject by transcription factors delivered by a system or method provided herein. In some embodiments, a pluripotent cell obtained after reprogramming of a somatic cell from a subject, or differentiated progeny of such a pluripotent cell, is used in a cell-replacement therapeutic approach. In some embodiments, cells of a cell type that is defective or exhibits diminished function in the subject are differentiated from a reprogrammed, pluripotent cell isolated from a somatic cell population obtained from the subject after reprogramming by methods or systems provided herein. In some embodiments, the reprogrammed cells or their differentiated progeny are administered to the subject from which the somatic cell was obtained in an autologous cell replacement therapeutic approach.

Methods for the culture and selection of reprogrammed, pluripotent cells are well known to those in the art. Methods for differentiation of such cells into functional differentiated cell types are also well known for many cell types of therapeutic interest and will be apparent to those of skill in the art. In general, a method useful for the differentiation of embryonic stem cells will be applicable to the differentiation of reprogrammed, pluripotent cells.

In some embodiments, a transcription factor able to convert a cell from one differentiated state into another is delivered to a target cell in vitro or in vivo by a system or method provided herein. Transcription factors that effect transdifferentiation are known in the art (see, e.g., Zhou et al., Nature 455:627-33, 2008). In some embodiments, a combination of the transcription factors Ngn3, Pdx1, and Mafa is delivered to a differentiated pancreatic exocrine cell by a system or method as provided by this invention. It is known in the art that expression of a combination of these transcription factors results in the reprogramming of differentiated pancreatic exocrine cells to insulin-producing β-cells (Zhou et al., Nature 455:627-33, 2008). In some embodiments, a reprogrammed insulin-producing β-cell is derived from a subject having a deficiency in insulin-producing β-cells and is used in a cell-replacement therapeutic approach involving the subject.

Nucleases

In some embodiments, a nuclease is delivered to a target cell by a system or method provided herein. In some embodiments, a zinc-finger nuclease is delivered to a target cell by a system or method provided herein.

Zinc finger nucleases are a class of artificial nucleases that comprise a DNA cleavage domain and a zinc finger DNA binding domain. In some embodiments, the DNA cleavage domain is a non-specific DNA cleavage domain of a restriction endonuclease, for example, of FokI. In some embodiments, the DNA cleavage domain is a domain that only cleaves double-stranded DNA when dimerized with a second DNA cleavage domain of the same type. In some embodiments, the DNA cleavage domain is fused to the C-terminus of the zinc finger domain via a linker, for example, a peptide linker. In some embodiments, the zinc finger domain comprises between about 3 and about 6 zinc fingers and specifically recognizes and binds a target sequence of about 9-20 nucleotides in length. In some embodiments, a plurality of zinc finger nuclease molecules is delivered to a target cell by a system or method provided by this invention, with the zinc finger domain of one zinc finger nuclease molecule binding a target sequence in close proximity to the target sequence of a second zinc finger nuclease molecule. In some embodiments, the zinc finger domains of the zinc finger nuclease molecules binding target sequences in close proximity to each other are different. In some embodiments, a zinc finger nuclease molecule delivered to a cell by a system or method provided herein binds a target nucleic acid sequence in close proximity to the target sequence of another zinc finger nuclease molecule, so that the DNA cleavage domains of the molecules dimerize and cleave a DNA molecule at a site between the two target sequences.
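
By way of non-limiting illustration, the geometry of two zinc finger nuclease target sequences lying in close proximity can be sketched in Python. The sketch below locates a left target sequence on the top strand and a right target sequence on the bottom strand and reports the number of nucleotides between them, i.e., the region in which the dimerized cleavage domains would cut. The function names, the example sequences, and the 5-7 nucleotide spacer window are hypothetical assumptions chosen for illustration only and are not taken from this disclosure.

```python
from typing import Optional

# Translation table for complementing an uppercase DNA string.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def spacer_between_sites(top_strand: str, left_site: str, right_site: str) -> Optional[int]:
    """Return the spacer length between two target sequences, or None.

    left_site is read 5'->3' on the top strand; right_site is read 5'->3'
    on the bottom strand, so it appears on the top strand as its reverse
    complement, downstream of the left site.
    """
    left_start = top_strand.find(left_site)
    if left_start == -1:
        return None
    left_end = left_start + len(left_site)
    right_start = top_strand.find(reverse_complement(right_site), left_end)
    if right_start == -1:
        return None
    return right_start - left_end


def spacer_is_in_window(spacer: Optional[int], min_len: int = 5, max_len: int = 7) -> bool:
    """Check the spacer against a hypothetical window (an assumption, not from the text)."""
    return spacer is not None and min_len <= spacer <= max_len
```

For example, with a hypothetical top strand `"TT" + "GATGAGGAT" + "ACTGAC" + "TGCCCCAGC" + "AA"`, a left site `"GATGAGGAT"`, and a right site `"GCTGGGGCA"`, `spacer_between_sites` returns the length of the six-nucleotide region between the two nine-nucleotide target sequences.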

In some embodiments, the genome of the target cell is edited by a zinc-finger nuclease or a plurality of zinc finger nucleases targeting a specific sequence of the genome after delivery. In some embodiments, a double-strand break is introduced at a specific site within the genome of a target cell by a zinc finger nuclease or a plurality of zinc finger nucleases, resulting in a disruption of the targeted genomic sequence. In some embodiments, the targeted genomic sequence is a nucleic acid sequence within the coding region of a gene. In some embodiments, the double-strand break introduced by the zinc finger nuclease or the plurality of zinc finger nucleases leads to a mutation within the target gene that impairs the expression of the respective gene product.

In some embodiments, the delivery of a zinc finger nuclease to a target cell results in a clinically or therapeutically beneficial disruption of the function of a specific gene. For example, in some embodiments, a zinc-finger nuclease targeting a nucleic acid sequence within the human CCR5 gene is delivered to T-cells of a human subject, for example, a subject diagnosed with an HIV infection/AIDS, by systems or methods provided herein, and zinc-finger-mediated editing of the CCR5 gene in the target T-cells leads to a loss-of-function CCR5 gene mutation associated with resistance to HIV infection. In some embodiments, the mutation effected in the target T-cells by the zinc finger nuclease mimics the naturally occurring CCR5Δ32 mutation (Kim et al., Genome Research, 19:1279-88, 2009).

In some embodiments, cells from a subject are obtained and a zinc finger nuclease is delivered to the cells by a system or method provided herein ex vivo. In some embodiments, the treated cells are selected for those cells in which a desired zinc-finger nuclease-mediated genomic editing event has been effected. In some embodiments, treated cells carrying a desired genomic mutation or alteration are returned to the subject they were obtained from. For example, in some embodiments, CD4⁺ T-lymphocytes are obtained from a subject diagnosed with HIV/AIDS and a zinc finger nuclease targeting a specific site within the CCR5 gene is delivered to the cells ex vivo. In some embodiments, CD4⁺ T-lymphocytes with the desired CCR5 mutation, for example, a mutation mimicking the naturally occurring CCR5Δ32 mutation, are selected and isolated by methods well known to those of skill in the art. In some embodiments, the cells are returned to the subject they were obtained from after the desired zinc-finger nuclease-mediated CCR5 mutation has been achieved.

Methods for engineering, generation, and isolation of zinc-finger nucleases targeting specific sequences and editing cellular genomes at specific target sequences, including at sequences within the human CCR5 gene, are well known in the art (see, e.g., Mani et al., Biochemical and Biophysical Research Communications 335:447-457, 2005; Perez et al., Nature Biotechnology 26:808-16, 2008; Kim et al., Genome Research, 19:1279-88, 2009; Urnov et al., Nature 435:646-51, 2005; Carroll et al., Gene Therapy 15:1463-68, 2005; Lombardo et al., Nature Biotechnology 25:1298-306, 2007; Kandavelou et al., Biochemical and Biophysical Research Communications 388:56-61, 2009; and Hockemeyer et al., Nature Biotechnology 27(9):851-59, 2009).

Recombinases

In some embodiments, a recombinase is delivered to a target cell by a system or method as provided herein. In some embodiments, the genome of the target cell comprises a nucleotide sequence recognized by the recombinase to be delivered. In some embodiments, the recombinase is Cre recombinase and the genome of the target cell comprises a loxP site. In some embodiments, the recombinase is FLP recombinase and the target cell comprises a flp recombination site. In some embodiments, the recombinase is Dre recombinase and the genome of the target cell comprises a rox recombination site. In some embodiments, the recombinase is delivered to a target cell comprising a reporter gene flanked by two recombination sites to loop out the reporter gene. In some embodiments, the recombinase is delivered to a target cell comprising, in its genome, a recombination site recognized by the recombinase, in temporal proximity with the delivery of a nucleic acid construct comprising a recombination site recognized by the recombinase, to insert part of the nucleic acid into the target cell genome by recombinase-mediated recombination. Methods for recombinase-mediated excision of reporter genes and the recombinase-mediated insertion of nucleic acid constructs are well known in the art (see, e.g., Beard et al., Genesis 44(1):23-28; Nolden et al., Methods in Molecular Medicine 140:17-32, 2007; and Anastassiadis et al., Disease Models and Mechanisms 2(9-10):508-15, 2009).

In some embodiments, a target cell comprises a recombination site, for example, a loxP site or a flp site, and a recombinase recognizing the recombination site, for example, Cre or FLP recombinase, is delivered to the target cell by a system or method provided herein. In some embodiments, the target cell comprises two recombination sites, for example, two loxP or flp sites flanking a gene of interest, or a part of a gene of interest, for example, an exon or a promoter region, or a reporter cassette previously introduced into the cell. In some embodiments, delivery of a recombinase effects recombinase-mediated excision, also referred to as loopout, of a nucleic acid sequence flanked by recombination sites in the target cell. In some embodiments, the delivery of a recombinase to a target cell by a system or method provided by this invention effects the excision of a nucleic acid sequence of about 500 bp, about 1 kb, about 2 kb, about 2-5 kb, about 5-10 kb, about 10-20 kb, about 20-50 kb, about 50-100 kb, about 100-200 kb, or more than about 200 kb.
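
By way of non-limiting illustration, the outcome of a recombinase-mediated excision (loopout) can be modeled at the sequence level. The Python sketch below assumes the canonical 34 bp loxP sequence and two loxP sites in the same orientation on the given strand; it returns the sequence remaining after excision, which retains a single loxP site, together with the flanked sequence that is removed. The function name and the short flanking sequences used in the usage note are hypothetical, and the sketch models only the sequence outcome, not the enzymatic mechanism.

```python
from typing import Tuple

# Canonical 34 bp loxP site: 13 bp inverted repeat, 8 bp spacer, 13 bp inverted repeat.
LOXP = "ATAACTTCGTATA" + "GCATACAT" + "TATACGAAGTTAT"


def excise_flanked_sequence(seq: str, site: str = LOXP) -> Tuple[str, str]:
    """Model excision of the sequence flanked by two same-orientation sites.

    Returns (remaining, flanked): `remaining` keeps one copy of the site,
    and `flanked` is the intervening sequence that is looped out.
    Raises ValueError when fewer than two sites are found on this strand.
    """
    first = seq.find(site)
    if first == -1:
        raise ValueError("no recombination site found")
    first_end = first + len(site)
    second = seq.find(site, first_end)
    if second == -1:
        raise ValueError("only one recombination site found")
    flanked = seq[first_end:second]
    # Keep everything up to and including the first site, then skip the
    # flanked sequence and the second site.
    remaining = seq[:first_end] + seq[second + len(site):]
    return remaining, flanked
```

For a hypothetical input of `"AAAA" + LOXP + "GGGGCCCC" + LOXP + "TTTT"`, the function returns `"AAAA" + LOXP + "TTTT"` as the remaining sequence and `"GGGGCCCC"` as the flanked sequence; the length of the flanked sequence corresponds to the size of the excised nucleic acid sequence discussed above.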

Small Molecules

The present invention provides systems and methods for delivery of small molecules to cells in vivo or in vitro. Such systems and methods typically involve association of one or more small molecules with supercharged proteins to form a complex, and delivery of the complex to one or more cells. In some embodiments, the small molecule may have therapeutic activity. Preferably, though not necessarily, the drug is one that has already been deemed safe and effective for use in humans or animals by the appropriate governmental agency or regulatory body. In certain embodiments, the small molecule is a drug approved by the U.S. Food and Drug Administration for use in humans or other animals. For example, drugs approved for human use are listed by the FDA under 21 C.F.R. §§330.5, 331 through 361, and 440 through 460, incorporated herein by reference; drugs for veterinary use are listed by the FDA under 21 C.F.R. §§500 through 589, incorporated herein by reference. All listed drugs are considered acceptable for use in accordance with the present invention. In some embodiments, delivery of the complex to cells involves administering a complex comprising supercharged proteins associated with a small molecule to a subject in need thereof. In some embodiments, a small molecule by itself may not be able to enter the interior of a cell, but is able to enter the interior of a cell when complexed with a supercharged protein. In some embodiments, a supercharged protein is utilized to allow a small molecule to enter a cell.

Formation of Complexes

The present invention provides complexes comprising supercharged proteins associated with one or more agents to be delivered. In some embodiments, supercharged proteins are associated with one or more agents to be delivered by non-covalent interactions. In some embodiments, supercharged proteins are associated with one or more nucleic acids by electrostatic interactions. In certain embodiments, supercharged proteins have an overall net positive charge, and the agents to be delivered, such as nucleic acids, have an overall net negative charge.

In certain embodiments, supercharged proteins are associated with one or more agents to be delivered by covalent interactions. For example, a supercharged protein may be fused to a peptide or protein to be delivered. Covalent interaction may be direct or indirect. In some embodiments, such covalent interactions are mediated by one or more linkers. In some embodiments, the linker is a cleavable linker. In certain embodiments, the cleavable linker comprises an amide, ester, or disulfide bond. For example, the linker may be an amino acid sequence that is cleavable by a cellular enzyme. In certain embodiments, the enzyme is a protease. In other embodiments, the enzyme is an esterase. In some embodiments, the enzyme is one that is more highly expressed in certain cell types than in other cell types. For example, the enzyme may be one that is more highly expressed in tumor cells than in non-tumor cells. Exemplary linkers and enzymes that cleave those linkers are presented in Table 3.

TABLE 3
Cleavable Linkers

Linker Sequence | Enzyme(s) Targeting Linker
X¹-AGVF-X (SEQ ID NO: 90) | lysosomal thiol proteinases (see, e.g., Duncan et al., 1982, Biosci. Rep., 2: 1041-46; incorporated herein by reference)
X-GFLG-X (SEQ ID NO: 91) | lysosomal cysteine proteinases (see, e.g., Vasey et al., Clin. Canc. Res., 1999, 5: 83-94; incorporated herein by reference)
X-FK-X (SEQ ID NO: 92) | Cathepsin B - ubiquitous, overexpressed in many solid tumors, such as breast cancer (see, e.g., Dubowchik et al., 2002, Bioconjugate Chem., 13: 855-69; incorporated herein by reference)
X-A*L-X (SEQ ID NO: 93) | Cathepsin B - ubiquitous, overexpressed in many solid tumors, such as breast cancer (see, e.g., Trouet et al., 1982, Proc. Natl. Acad. Sci., USA, 79: 626-29; incorporated herein by reference)
X-A*LA*L-X (SEQ ID NO: 94) | Cathepsin B - ubiquitous, overexpressed in many solid tumors (see, e.g., Schmid et al., 2007, Bioconjugate Chem., 18: 702-16; incorporated herein by reference)
X-AL*AL*A-X (SEQ ID NO: 95) | Cathepsin D - ubiquitous (see, e.g., Czerwinski et al., 1998, Proc. Natl. Acad. Sci., USA, 95: 11520-25; incorporated herein by reference)

¹X denotes a supercharged protein and/or agent to be delivered
*refers to observed cleavage site

To give but one particular example, a +36 GFP may be associated with an agent to be delivered by a cleavable linker, such as ALAL (SEQ ID NO: 96), to generate +36 GFP-(GGS)₄-ALAL-(GGS)₄-X (SEQ ID NO: 154; where X is the agent to be delivered).
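
By way of non-limiting illustration, the linker portion of the foregoing construct can be written out as a one-letter amino acid string. The Python sketch below assembles only the (GGS)₄-ALAL-(GGS)₄ segment named above; the function names are hypothetical, and the supercharged protein and agent sequences are supplied by the caller as placeholders because they are not reproduced here.

```python
FLEXIBLE_REPEAT = "GGS"  # repeat unit named in the text
REPEAT_COUNT = 4         # the subscript in (GGS)4
CLEAVABLE_SITE = "ALAL"  # cleavable linker named in the text


def build_linker(site: str = CLEAVABLE_SITE,
                 repeat: str = FLEXIBLE_REPEAT,
                 count: int = REPEAT_COUNT) -> str:
    """Return (repeat)count + site + (repeat)count as a one-letter string."""
    spacer = repeat * count
    return spacer + site + spacer


def build_fusion(protein_seq: str, agent_seq: str) -> str:
    """Join a protein sequence and an agent sequence through the linker.

    Both arguments are placeholders supplied by the caller; the sketch
    does not contain the +36 GFP sequence or any agent sequence.
    """
    return protein_seq + build_linker() + agent_seq
```

Calling `build_linker()` yields a 28-residue string in which the ALAL site is flanked on each side by four GGS repeats. Other cleavable sequences from Table 3 may be substituted through the `site` argument.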

In certain embodiments, the agent to be delivered is a nucleic acid. In some embodiments, complexes are formed by incubating supercharged proteins with nucleic acids. In some embodiments, formation of complexes is carried out in a buffered solution. In some embodiments, formation of complexes is carried out at or around pH 7. In some embodiments, formation of complexes is carried out at about pH 5, about pH 6, about pH 7, about pH 8, or about pH 9. Formation of complexes is typically carried out at a pH that does not negatively affect the function of the supercharged protein and/or nucleic acid.

In some embodiments, formation of complexes is carried out at room temperature. In some embodiments, formation of complexes is carried out at or around 37° C. In some embodiments, formation of complexes is carried out below 4° C., at about 4° C., at about 10° C., at about 15° C., at about 20° C., at about 25° C., at about 30° C., at about 35° C., at about 37° C., at about 40° C., or higher than 40° C. Formation of complexes is typically carried out at a temperature that does not negatively affect the function of the supercharged protein and/or nucleic acid.

In some embodiments, formation of complexes is carried out in serum-free medium. In some embodiments, formation of complexes is carried out in the presence of CO₂ (e.g., about 1%, about 2%, about 3%, about 4%, about 5%, about 6%, or more).

In some embodiments, formation of complexes is carried out using concentrations of nucleic acid of about 100 nM. In some embodiments, formation of complexes is carried out using concentrations of nucleic acid of about 25 nM, about 50 nM, about 75 nM, about 90 nM, about 100 nM, about 110 nM, about 125 nM, about 150 nM, about 175 nM, or about 200 nM. In some embodiments, formation of complexes is carried out using concentrations of supercharged protein of about 40 nM. In some embodiments, formation of complexes is carried out using concentrations of supercharged protein of about 10 nM, about 20 nM, about 30 nM, about 40 nM, about 50 nM, about 60 nM, about 70 nM, about 80 nM, about 90 nM, or about 100 nM.

In some embodiments, formation of complexes is carried out under conditions of excess nucleic acid. In some embodiments, formation of complexes is carried out with ratios of nucleic acid:supercharged protein of about 20:1, about 10:1, about 9:1, about 8:1, about 7:1, about 6:1, about 5:1, about 4:1, about 3:1, about 2:1, or about 1:1. In some embodiments, formation of complexes is carried out with ratios of nucleic acid:supercharged protein of about 3:1. In some embodiments, formation of complexes is carried out with ratios of supercharged protein:nucleic acid of about 20:1, about 10:1, about 9:1, about 8:1, about 7:1, about 6:1, about 5:1, about 4:1, about 3:1, about 2:1, or about 1:1.
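
By way of non-limiting illustration, the volumes needed to combine a nucleic acid and a supercharged protein at a chosen final concentration and ratio can be computed with the standard dilution relation C₁V₁ = C₂V₂. The Python sketch below is a minimal example under stated assumptions: the ratio is treated as a molar ratio of nucleic acid to supercharged protein, and the function name, stock concentrations, and final volume are hypothetical values chosen for illustration.

```python
from typing import Dict


def mixing_volumes_ul(final_volume_ul: float,
                      nucleic_acid_final_nM: float,
                      ratio_na_to_protein: float,
                      na_stock_nM: float,
                      protein_stock_nM: float) -> Dict[str, float]:
    """Return stock and buffer volumes (in microliters) for one mixture.

    The ratio is assumed to be molar (nucleic acid : supercharged protein).
    Raises ValueError when the stocks are too dilute to reach the targets.
    """
    if min(final_volume_ul, nucleic_acid_final_nM, ratio_na_to_protein,
           na_stock_nM, protein_stock_nM) <= 0:
        raise ValueError("all inputs must be positive")
    protein_final_nM = nucleic_acid_final_nM / ratio_na_to_protein
    # C1 * V1 = C2 * V2, solved for the stock volume V1.
    na_volume = final_volume_ul * nucleic_acid_final_nM / na_stock_nM
    protein_volume = final_volume_ul * protein_final_nM / protein_stock_nM
    buffer_volume = final_volume_ul - na_volume - protein_volume
    if buffer_volume < 0:
        raise ValueError("stock solutions are too dilute for these targets")
    return {
        "protein_final_nM": protein_final_nM,
        "nucleic_acid_ul": na_volume,
        "protein_ul": protein_volume,
        "buffer_ul": buffer_volume,
    }
```

For instance, a 100 μL mixture with nucleic acid at 100 nM and a 2.5:1 ratio corresponds to a supercharged protein concentration of 40 nM, matching the exemplary concentrations recited above; the hypothetical stock concentrations then determine how much of each stock and of buffer is combined.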

In some embodiments, formation of complexes is carried out by mixing supercharged protein with nucleic acid, and agitating the mixture (e.g., by inversion). In some embodiments, formation of complexes is carried out by mixing supercharged protein with nucleic acid, and allowing the mixture to sit still. In some embodiments, the formation of the complex is carried out in the presence of a pharmaceutically acceptable carrier or excipient. In some embodiments, the complex is further combined with a pharmaceutically acceptable carrier or excipient. Exemplary excipients or carriers include water, solvents, lipids, proteins, peptides, endosomolytic agents (e.g., chloroquine, pyrene butyric acid), small molecules, carbohydrates, buffers, natural polymers, synthetic polymers (e.g., PLGA, polyurethane, polyesters, polycaprolactone, polyphosphazenes), pharmaceutical agents, etc.

In some embodiments, complexes comprising supercharged protein and nucleic acid may migrate more slowly in gel electrophoresis assays than either the supercharged protein alone or the nucleic acid alone.

Applications

The present invention provides supercharged proteins or complexes comprising supercharged proteins, naturally occurring or engineered, associated with agents to be delivered, as well as methods for using such complexes. Any agent may be delivered using the inventive system. In the case of delivering nucleic acids, since nucleic acids generally have net negative charges, supercharged proteins that associate with nucleic acids are typically superpositively charged proteins. The inventive supercharged proteins or complexes may be used to treat or prevent any disease that can benefit, e.g., from the delivery of an agent to a cell. The inventive supercharged proteins or complexes may also be used to transfect or treat cells for research purposes.

In some embodiments, supercharged proteins or complexes in accordance with the invention may be used for research purposes, e.g., to efficiently deliver nucleic acids to cells in a research context. In some embodiments, supercharged proteins may be used as research tools to efficiently transform cells with nucleic acids. In some embodiments, supercharged proteins may be used as research tools to efficiently introduce RNAi agents into cells for purposes of studying RNAi mechanisms. In some embodiments, supercharged proteins may be used as research tools to silence genes in a cell. In certain embodiments, supercharged proteins may be used to deliver a peptide or protein into a cell for the purpose of studying the biological activity of the peptide or protein. In certain embodiments, supercharged proteins may be introduced into a cell for the purpose of studying the biological activity of the peptide or protein. In certain embodiments, supercharged proteins may be used to deliver a small molecule into a cell for the purpose of studying the biological activity of the small molecule.

In some embodiments, supercharged proteins or complexes in accordance with the present invention may be used for therapeutic purposes. In some embodiments, supercharged proteins or complexes in accordance with the present invention may be used for treatment of any of a variety of diseases, disorders, and/or conditions, including but not limited to one or more of the following: autoimmune disorders (e.g. diabetes, lupus, multiple sclerosis, psoriasis, rheumatoid arthritis); inflammatory disorders (e.g. arthritis, pelvic inflammatory disease); infectious diseases (e.g. viral infections (e.g., HIV, HCV, RSV), bacterial infections, fungal infections, sepsis); neurological disorders (e.g. Alzheimer's disease, Huntington's disease, autism, Duchenne muscular dystrophy); cardiovascular disorders (e.g. atherosclerosis, hypercholesterolemia, thrombosis, clotting disorders, angiogenic disorders such as macular degeneration); proliferative disorders (e.g. cancer, benign neoplasms); respiratory disorders (e.g. chronic obstructive pulmonary disease); digestive disorders (e.g. inflammatory bowel disease, ulcers); musculoskeletal disorders (e.g. fibromyalgia, arthritis); endocrine, metabolic, and nutritional disorders (e.g. diabetes, osteoporosis); urological disorders (e.g. renal disease); psychological disorders (e.g. depression, schizophrenia); skin disorders (e.g. wounds, eczema); blood and lymphatic disorders (e.g. anemia, hemophilia); etc.

Supercharged proteins or complexes of the invention may be used in a clinical setting. For example, a supercharged protein may be associated with a nucleic acid that can be used for therapeutic applications. Such nucleic acids may include functional RNAs that are used to reduce levels of one or more target transcripts (e.g., siRNAs, shRNAs, microRNAs, antisense RNAs, ribozymes, etc.). In some embodiments, a disease, disorder, and/or condition may be associated with abnormally high levels of one or more particular mRNAs and/or proteins. To give but one particular example, many forms of breast cancer are associated with increased expression of the epidermal growth factor receptor (EGFR). Supercharged proteins may be utilized to deliver an RNAi agent that targets EGFR mRNA to cells (e.g., breast cancer tumor cells). Supercharged proteins may be efficiently taken up by tumor cells, resulting in delivery of the RNAi agent. Upon delivery, the RNAi agent may be effective to reduce levels of EGFR mRNA, thereby reducing levels of EGFR protein. Such a method may be an effective treatment for breast cancers (e.g., breast cancers associated with elevated levels of EGFR). One of ordinary skill in the art will recognize that similar methods may be used to treat any disease, disorder, and/or condition that is associated with elevated levels of one or more particular mRNAs and/or proteins.

In some embodiments, a disease, disorder, and/or condition may be associated with abnormally low levels of one or more particular mRNAs and/or proteins. To give but one particular example, tyrosinemia is a disorder in which the body cannot effectively break down the amino acid tyrosine. There are three types of tyrosinemia, each caused by a deficiency in a different enzyme. Supercharged proteins may be used to treat tyrosinemia by delivering a vector that drives expression of the deficient enzyme. Upon delivery of the vector to cells, cellular machinery can direct expression of the deficient enzyme, thereby treating a patient's tyrosinemia. One of ordinary skill in the art will recognize that similar methods may be used to treat any disease, disorder, and/or condition that is associated with abnormally low levels of one or more particular mRNAs and/or proteins.

As demonstrated in Examples 2 and 3, supercharged protein-based nucleic acid delivery to cells is successful, even using cell lines that are resistant to nucleic acid transfection using conventional cationic lipid-based transfection methods. Thus, in some embodiments, supercharged proteins are utilized to deliver nucleic acids to cells which are resistant to other methods of nucleic acid delivery (e.g., cationic lipid-based transformation methods, such as use of Lipofectamine). Furthermore, the present inventors have demonstrated that, surprisingly, superpositively charged proteins can be used at low nanomolar (nM) concentrations (e.g., 1 nM to 100 nM) to effectively deliver nucleic acids to cells. In some embodiments, supercharged proteins can be used at about 1 nM, about 5 nM, about 10 nM, about 25 nM, about 50 nM, about 75 nM, about 100 nM, or higher than about 100 nM to effectively deliver nucleic acids to cells.

In some embodiments, a supercharged protein may be a therapeutic agent. For example, a supercharged protein may be a supercharged variant of a protein drug (e.g., abatacept, adalimumab, alefacept, erythropoietin, etanercept, human growth hormone, infliximab, insulin, trastuzumab, interferons, etc.). In some embodiments, a supercharged protein may be a therapeutic agent, and an associated nucleic acid may be useful for targeting delivery of the therapeutic protein to a target site. For example, a supercharged protein may be a supercharged variant of a protein drug (e.g., abatacept, adalimumab, alefacept, erythropoietin, etanercept, human growth hormone, infliximab, insulin, trastuzumab, interferons, etc.), and an associated nucleic acid may be an aptamer that efficiently targets the therapeutic protein to a target organ, tissue, and/or cell. The supercharged protein can also be an imaging, diagnostic, or other detection agent.

In some embodiments, one or both of the supercharged protein and an agent to be delivered (if present) may have detectable qualities. For example, one or both of the supercharged protein and the agent may comprise at least one fluorescent moiety. In some embodiments, the supercharged protein has inherent fluorescent qualities (e.g., GFP). In some embodiments, one or both of the supercharged protein and the agent to be delivered may be associated with at least one fluorescent moiety (e.g., conjugated to a fluorophore, fluorescent dye, etc.). Alternatively or additionally, one or both of the supercharged protein and the agent to be delivered may comprise at least one radioactive moiety (e.g., protein may comprise ³⁵S; nucleic acid may comprise ³²P; etc.). Such detectable moieties may be useful for detecting and/or monitoring delivery of the supercharged proteins or complexes to target sites.

In some embodiments, the supercharged protein or an agent associated with a supercharged protein includes a detectable label. These molecules can be used in detection, imaging, disease staging, diagnosis, or patient selection. Suitable labels include fluorescent, chemiluminescent, enzymatic, colorimetric, phosphorescent, and density-based labels (e.g., labels based on electron density), contrast agents in general, and/or radioactive labels.

Pharmaceutical Compositions

The present invention provides supercharged proteins and complexes comprising supercharged proteins associated with at least one agent to be delivered. Thus, the present invention provides pharmaceutical compositions comprising one or more supercharged proteins or one or more such complexes, and one or more pharmaceutically acceptable excipients. Pharmaceutical compositions may optionally comprise one or more additional therapeutically active substances. In accordance with some embodiments, a method of administering pharmaceutical compositions comprising one or more supercharged proteins or one or more complexes comprising supercharged proteins associated with at least one agent to be delivered to a subject in need thereof is provided. In some embodiments, compositions are administered to humans. For the purposes of the present disclosure, the phrase “active ingredient” generally refers to a supercharged protein or complex comprising a supercharged protein and at least one agent to be delivered as described herein.

Although the descriptions of pharmaceutical compositions provided herein are principally directed to pharmaceutical compositions which are suitable for administration to humans, it will be understood by the skilled artisan that such compositions are generally suitable for administration to animals of all sorts. Modification of pharmaceutical compositions suitable for administration to humans in order to render the compositions suitable for administration to various animals is well understood, and the ordinarily skilled veterinary pharmacologist can design and/or perform such modification with merely ordinary, if any, experimentation. Subjects to which administration of the pharmaceutical compositions is contemplated include, but are not limited to, humans and/or other primates; mammals, including commercially relevant mammals such as cattle, pigs, horses, sheep, cats, dogs, mice, and/or rats; and/or birds, including commercially relevant birds such as chickens, ducks, geese, and/or turkeys.

Formulations of the pharmaceutical compositions described herein may be prepared by any method known or hereafter developed in the art of pharmacology. In general, such preparatory methods include the step of bringing the active ingredient into association with an excipient and/or one or more other accessory ingredients, and then, if necessary and/or desirable, shaping and/or packaging the product into a desired single- or multi-dose unit.

A pharmaceutical composition in accordance with the invention may be prepared, packaged, and/or sold in bulk, as a single unit dose, and/or as a plurality of single unit doses. As used herein, a “unit dose” is a discrete amount of the pharmaceutical composition comprising a predetermined amount of the active ingredient. The amount of the active ingredient is generally equal to the dosage of the active ingredient which would be administered to a subject and/or a convenient fraction of such a dosage such as, for example, one-half or one-third of such a dosage.

Relative amounts of the active ingredient, the pharmaceutically acceptable excipient, and/or any additional ingredients in a pharmaceutical composition in accordance with the invention will vary, depending upon the identity, size, and/or condition of the subject treated and further depending upon the route by which the composition is to be administered. By way of example, the composition may comprise between 0.1% and 100% (w/w) active ingredient.

Pharmaceutical formulations may additionally comprise a pharmaceutically acceptable excipient, which, as used herein, includes any and all solvents, dispersion media, diluents, or other liquid vehicles, dispersion or suspension aids, surface active agents, isotonic agents, thickening or emulsifying agents, preservatives, solid binders, lubricants and the like, as suited to the particular dosage form desired. Remington's The Science and Practice of Pharmacy, 21st Edition, A. R. Gennaro (Lippincott, Williams & Wilkins, Baltimore, Md., 2006; incorporated herein by reference) discloses various excipients used in formulating pharmaceutical compositions and known techniques for the preparation thereof. Except insofar as any conventional excipient medium is incompatible with a substance or its derivatives, such as by producing any undesirable biological effect or otherwise interacting in a deleterious manner with any other component(s) of the pharmaceutical composition, its use is contemplated to be within the scope of this invention.

In some embodiments, a pharmaceutically acceptable excipient is at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% pure. In some embodiments, an excipient is approved for use in humans and for veterinary use. In some embodiments, an excipient is approved by the United States Food and Drug Administration. In some embodiments, an excipient is pharmaceutical grade. In some embodiments, an excipient meets the standards of the United States Pharmacopoeia (USP), the European Pharmacopoeia (EP), the British Pharmacopoeia, and/or the International Pharmacopoeia.

Pharmaceutically acceptable excipients used in the manufacture of pharmaceutical compositions include, but are not limited to, inert diluents, dispersing and/or granulating agents, surface active agents and/or emulsifiers, disintegrating agents, binding agents, preservatives, buffering agents, lubricating agents, and/or oils. Such excipients may optionally be included in pharmaceutical formulations. Excipients such as cocoa butter and suppository waxes, coloring agents, coating agents, sweetening, flavoring, and/or perfuming agents can be present in the composition, according to the judgment of the formulator.

Exemplary diluents include, but are not limited to, calcium carbonate, sodium carbonate, calcium phosphate, dicalcium phosphate, calcium sulfate, calcium hydrogen phosphate, sodium phosphate, lactose, sucrose, cellulose, microcrystalline cellulose, kaolin, mannitol, sorbitol, inositol, sodium chloride, dry starch, cornstarch, powdered sugar, etc., and/or combinations thereof.

Exemplary granulating and/or dispersing agents include, but are not limited to, potato starch, corn starch, tapioca starch, sodium starch glycolate, clays, alginic acid, guar gum, citrus pulp, agar, bentonite, cellulose and wood products, natural sponge, cation-exchange resins, calcium carbonate, silicates, sodium carbonate, cross-linked poly(vinyl-pyrrolidone) (crospovidone), sodium carboxymethyl starch (sodium starch glycolate), carboxymethyl cellulose, cross-linked sodium carboxymethyl cellulose (croscarmellose), methylcellulose, pregelatinized starch (starch 1500), microcrystalline starch, water insoluble starch, calcium carboxymethyl cellulose, magnesium aluminum silicate (Veegum), sodium lauryl sulfate, quaternary ammonium compounds, etc., and/or combinations thereof.

Exemplary surface active agents and/or emulsifiers include, but are not limited to, natural emulsifiers (e.g. acacia, agar, alginic acid, sodium alginate, tragacanth, chondrus, cholesterol, xanthan, pectin, gelatin, egg yolk, casein, wool fat, wax, and lecithin), colloidal clays (e.g. bentonite [aluminum silicate] and Veegum® [magnesium aluminum silicate]), long chain amino acid derivatives, high molecular weight alcohols (e.g. stearyl alcohol, cetyl alcohol, oleyl alcohol, triacetin monostearate, ethylene glycol distearate, glyceryl monostearate, and propylene glycol monostearate, polyvinyl alcohol), carbomers (e.g. carboxy polymethylene, polyacrylic acid, acrylic acid polymer, and carboxyvinyl polymer), carrageenan, cellulosic derivatives (e.g. carboxymethylcellulose sodium, powdered cellulose, hydroxymethyl cellulose, hydroxypropyl cellulose, hydroxypropyl methylcellulose, methylcellulose), sorbitan fatty acid esters (e.g. polyoxyethylene sorbitan monolaurate [Tween®20], polyoxyethylene sorbitan [Tween®60], polyoxyethylene sorbitan monooleate [Tween®80], sorbitan monopalmitate [Span®40], sorbitan monostearate [Span®60], sorbitan tristearate [Span®65], glyceryl monooleate, sorbitan monooleate [Span®80]), polyoxyethylene esters (e.g. polyoxyethylene monostearate [Myrj®45], polyoxyethylene hydrogenated castor oil, polyethoxylated castor oil, polyoxymethylene stearate, and Solutol®), sucrose fatty acid esters, polyethylene glycol fatty acid esters (e.g. Cremophor®), polyoxyethylene ethers (e.g. polyoxyethylene lauryl ether [Brij®30]), poly(vinyl-pyrrolidone), diethylene glycol monolaurate, triethanolamine oleate, sodium oleate, potassium oleate, ethyl oleate, oleic acid, ethyl laurate, sodium lauryl sulfate, Pluronic®F 68, Poloxamer®188, cetrimonium bromide, cetylpyridinium chloride, benzalkonium chloride, docusate sodium, etc. and/or combinations thereof.

Exemplary binding agents include, but are not limited to, starch (e.g. cornstarch and starch paste); gelatin; sugars (e.g. sucrose, glucose, dextrose, dextrin, molasses, lactose, lactitol, mannitol); natural and synthetic gums (e.g. acacia, sodium alginate, extract of Irish moss, panwar gum, ghatti gum, mucilage of isapol husks, carboxymethylcellulose, methylcellulose, ethylcellulose, hydroxyethylcellulose, hydroxypropyl cellulose, hydroxypropyl methylcellulose, microcrystalline cellulose, cellulose acetate, poly(vinyl-pyrrolidone), magnesium aluminum silicate (Veegum®), and larch arabogalactan); alginates; polyethylene oxide; polyethylene glycol; inorganic calcium salts; silicic acid; polymethacrylates; waxes; water; alcohol; etc.; and combinations thereof.

Exemplary preservatives may include, but are not limited to, antioxidants, chelating agents, antimicrobial preservatives, antifungal preservatives, alcohol preservatives, acidic preservatives, and/or other preservatives. Exemplary antioxidants include, but are not limited to, alpha tocopherol, ascorbic acid, ascorbyl palmitate, butylated hydroxyanisole, butylated hydroxytoluene, monothioglycerol, potassium metabisulfite, propionic acid, propyl gallate, sodium ascorbate, sodium bisulfite, sodium metabisulfite, and/or sodium sulfite. Exemplary chelating agents include ethylenediaminetetraacetic acid (EDTA), citric acid monohydrate, disodium edetate, dipotassium edetate, edetic acid, fumaric acid, malic acid, phosphoric acid, sodium edetate, tartaric acid, and/or trisodium edetate. Exemplary antimicrobial preservatives include, but are not limited to, benzalkonium chloride, benzethonium chloride, benzyl alcohol, bronopol, cetrimide, cetylpyridinium chloride, chlorhexidine, chlorobutanol, chlorocresol, chloroxylenol, cresol, ethyl alcohol, glycerin, hexetidine, imidurea, phenol, phenoxyethanol, phenylethyl alcohol, phenylmercuric nitrate, propylene glycol, and/or thimerosal. Exemplary antifungal preservatives include, but are not limited to, butyl paraben, methyl paraben, ethyl paraben, propyl paraben, benzoic acid, hydroxybenzoic acid, potassium benzoate, potassium sorbate, sodium benzoate, sodium propionate, and/or sorbic acid. Exemplary alcohol preservatives include, but are not limited to, ethanol, polyethylene glycol, phenol, phenolic compounds, bisphenol, chlorobutanol, hydroxybenzoate, and/or phenylethyl alcohol. Exemplary acidic preservatives include, but are not limited to, vitamin A, vitamin C, vitamin E, beta-carotene, citric acid, acetic acid, dehydroacetic acid, ascorbic acid, sorbic acid, and/or phytic acid. Other preservatives include, but are not limited to, tocopherol, tocopherol acetate, deteroxime mesylate, cetrimide, butylated hydroxyanisole (BHA), butylated hydroxytoluene (BHT), ethylenediamine, sodium lauryl sulfate (SLS), sodium lauryl ether sulfate (SLES), sodium bisulfite, sodium metabisulfite, potassium sulfite, potassium metabisulfite, Glydant Plus®, Phenonip®, methylparaben, Germall®115, Germaben®II, Neolone™, Kathon™, and/or Euxyl®.

Exemplary buffering agents include, but are not limited to, citrate buffer solutions, acetate buffer solutions, phosphate buffer solutions, ammonium chloride, calcium carbonate, calcium chloride, calcium citrate, calcium glubionate, calcium gluceptate, calcium gluconate, D-gluconic acid, calcium glycerophosphate, calcium lactate, propanoic acid, calcium levulinate, pentanoic acid, dibasic calcium phosphate, phosphoric acid, tribasic calcium phosphate, calcium hydroxide phosphate, potassium acetate, potassium chloride, potassium gluconate, potassium mixtures, dibasic potassium phosphate, monobasic potassium phosphate, potassium phosphate mixtures, sodium acetate, sodium bicarbonate, sodium chloride, sodium citrate, sodium lactate, dibasic sodium phosphate, monobasic sodium phosphate, sodium phosphate mixtures, tromethamine, magnesium hydroxide, aluminum hydroxide, alginic acid, pyrogen-free water, isotonic saline, Ringer's solution, ethyl alcohol, etc., and/or combinations thereof.

Exemplary lubricating agents include, but are not limited to, magnesium stearate, calcium stearate, stearic acid, silica, talc, malt, glyceryl behenate, hydrogenated vegetable oils, polyethylene glycol, sodium benzoate, sodium acetate, sodium chloride, leucine, magnesium lauryl sulfate, sodium lauryl sulfate, etc., and combinations thereof.

Exemplary oils include, but are not limited to, almond, apricot kernel, avocado, babassu, bergamot, black currant seed, borage, cade, camomile, canola, caraway, carnauba, castor, cinnamon, cocoa butter, coconut, cod liver, coffee, corn, cotton seed, emu, eucalyptus, evening primrose, fish, flaxseed, geraniol, gourd, grape seed, hazel nut, hyssop, isopropyl myristate, jojoba, kukui nut, lavandin, lavender, lemon, litsea cubeba, macadamia nut, mallow, mango seed, meadowfoam seed, mink, nutmeg, olive, orange, orange roughy, palm, palm kernel, peach kernel, peanut, poppy seed, pumpkin seed, rapeseed, rice bran, rosemary, safflower, sandalwood, sasquana, savoury, sea buckthorn, sesame, shea butter, silicone, soybean, sunflower, tea tree, thistle, tsubaki, vetiver, walnut, and wheat germ oils. Exemplary oils also include, but are not limited to, butyl stearate, caprylic triglyceride, capric triglyceride, cyclomethicone, diethyl sebacate, dimethicone 360, isopropyl myristate, mineral oil, octyldodecanol, oleyl alcohol, silicone oil, and/or combinations thereof.

Liquid dosage forms for oral and parenteral administration include, but are not limited to, pharmaceutically acceptable emulsions, microemulsions, solutions, suspensions, syrups, and/or elixirs. In addition to active ingredients, liquid dosage forms may comprise inert diluents commonly used in the art such as, for example, water or other solvents, solubilizing agents and emulsifiers such as ethyl alcohol, isopropyl alcohol, ethyl carbonate, ethyl acetate, benzyl alcohol, benzyl benzoate, propylene glycol, 1,3-butylene glycol, dimethylformamide, oils (in particular, cottonseed, groundnut, corn, germ, olive, castor, and sesame oils), glycerol, tetrahydrofurfuryl alcohol, polyethylene glycols and fatty acid esters of sorbitan, and mixtures thereof. Besides inert diluents, oral compositions can include adjuvants such as wetting agents, emulsifying and suspending agents, sweetening, flavoring, and/or perfuming agents. In certain embodiments for parenteral administration, compositions are mixed with solubilizing agents such as Cremophor®, alcohols, oils, modified oils, glycols, polysorbates, cyclodextrins, polymers, and/or combinations thereof.

Injectable preparations, for example, sterile injectable aqueous or oleaginous suspensions may be formulated according to the known art using suitable dispersing agents, wetting agents, and/or suspending agents. Sterile injectable preparations may be sterile injectable solutions, suspensions, and/or emulsions in nontoxic parenterally acceptable diluents and/or solvents, for example, as a solution in 1,3-butanediol. Among the acceptable vehicles and solvents that may be employed are water, Ringer's solution, U.S.P., and isotonic sodium chloride solution. Sterile, fixed oils are conventionally employed as a solvent or suspending medium. For this purpose any bland fixed oil can be employed including synthetic mono- or diglycerides. Fatty acids such as oleic acid can be used in the preparation of injectables.

Injectable formulations can be sterilized, for example, by filtration through a bacterial-retaining filter, and/or by incorporating sterilizing agents in the form of sterile solid compositions which can be dissolved or dispersed in sterile water or other sterile injectable medium prior to use.

In order to prolong the effect of an active ingredient, it is often desirable to slow the absorption of the active ingredient from subcutaneous or intramuscular injection. This may be accomplished by the use of a liquid suspension of crystalline or amorphous material with poor water solubility. The rate of absorption of the drug then depends upon its rate of dissolution which, in turn, may depend upon crystal size and crystalline form. Alternatively, delayed absorption of a parenterally administered drug form is accomplished by dissolving or suspending the drug in an oil vehicle. Injectable depot forms are made by forming microencapsule matrices of the drug in biodegradable polymers such as polylactide-polyglycolide. Depending upon the ratio of drug to polymer and the nature of the particular polymer employed, the rate of drug release can be controlled. Examples of other biodegradable polymers include poly(orthoesters) and poly(anhydrides). Depot injectable formulations are prepared by entrapping the drug in liposomes or microemulsions which are compatible with body tissues.

Compositions for rectal or vaginal administration are typically suppositories which can be prepared by mixing compositions with suitable non-irritating excipients such as cocoa butter, polyethylene glycol or a suppository wax which are solid at ambient temperature but liquid at body temperature and therefore melt in the rectum or vaginal cavity and release the active ingredient.

Solid dosage forms for oral administration include capsules, tablets, pills, powders, and granules. In such solid dosage forms, an active ingredient is mixed with at least one inert, pharmaceutically acceptable excipient such as sodium citrate or dicalcium phosphate and/or fillers or extenders (e.g. starches, lactose, sucrose, glucose, mannitol, and silicic acid), binders (e.g. carboxymethylcellulose, alginates, gelatin, polyvinylpyrrolidinone, sucrose, and acacia), humectants (e.g. glycerol), disintegrating agents (e.g. agar, calcium carbonate, potato or tapioca starch, alginic acid, certain silicates, and sodium carbonate), solution retarding agents (e.g. paraffin), absorption accelerators (e.g. quaternary ammonium compounds), wetting agents (e.g. cetyl alcohol and glycerol monostearate), absorbents (e.g. kaolin and bentonite clay), and lubricants (e.g. talc, calcium stearate, magnesium stearate, solid polyethylene glycols, sodium lauryl sulfate), and mixtures thereof. In the case of capsules, tablets and pills, the dosage form may comprise buffering agents.

Solid compositions of a similar type may be employed as fillers in soft and hard-filled gelatin capsules using such excipients as lactose or milk sugar as well as high molecular weight polyethylene glycols and the like. Solid dosage forms of tablets, dragees, capsules, pills, and granules can be prepared with coatings and shells such as enteric coatings and other coatings well known in the pharmaceutical formulating art. They may optionally comprise opacifying agents and can be of a composition that they release the active ingredient(s) only, or preferentially, in a certain part of the intestinal tract, optionally, in a delayed manner. Examples of embedding compositions which can be used include polymeric substances and waxes.

Dosage forms for topical and/or transdermal administration of a composition may include ointments, pastes, creams, lotions, gels, powders, solutions, sprays, inhalants and/or patches. Generally, an active ingredient is admixed under sterile conditions with a pharmaceutically acceptable excipient and/or any needed preservatives and/or buffers as may be required. Additionally, the present invention contemplates the use of transdermal patches, which often have the added advantage of providing controlled delivery of a compound to the body. Such dosage forms may be prepared, for example, by dissolving and/or dispensing the compound in the proper medium. Alternatively or additionally, rate may be controlled by either providing a rate controlling membrane and/or by dispersing the compound in a polymer matrix and/or gel.

Suitable devices for use in delivering intradermal pharmaceutical compositions described herein include short needle devices such as those described in U.S. Pat. Nos. 4,886,499; 5,190,521; 5,328,483; 5,527,288; 4,270,537; 5,015,235; 5,141,496; and 5,417,662. Intradermal compositions may be administered by devices which limit the effective penetration length of a needle into the skin, such as those described in PCT publication WO 99/34850 and functional equivalents thereof. Jet injection devices which deliver liquid compositions to the dermis via a liquid jet injector and/or via a needle which pierces the stratum corneum and produces a jet which reaches the dermis are suitable. Jet injection devices are described, for example, in U.S. Pat. Nos. 5,480,381; 5,599,302; 5,334,144; 5,993,412; 5,649,912; 5,569,189; 5,704,911; 5,383,851; 5,893,397; 5,466,220; 5,339,163; 5,312,335; 5,503,627; 5,064,413; 5,520,639; 4,596,556; 4,790,824; 4,941,880; 4,940,460; and PCT publications WO 97/37705 and WO 97/13537. Ballistic powder/particle delivery devices which use compressed gas to accelerate vaccine in powder form through the outer layers of the skin to the dermis are suitable. Alternatively or additionally, conventional syringes may be used in the classical Mantoux method of intradermal administration.

Formulations suitable for topical administration include, but are not limited to, liquid and/or semi liquid preparations such as liniments, lotions, oil in water and/or water in oil emulsions such as creams, ointments and/or pastes, and/or solutions and/or suspensions. Topically-administrable formulations may, for example, comprise from about 1% to about 10% (w/w) active ingredient, although the concentration of active ingredient may be as high as the solubility limit of the active ingredient in the solvent. Formulations for topical administration may further comprise one or more of the additional ingredients described herein.

A pharmaceutical composition may be prepared, packaged, and/or sold in a formulation suitable for pulmonary administration via the buccal cavity. Such a formulation may comprise dry particles which comprise the active ingredient and which have a diameter in the range from about 0.5 nm to about 7 nm or from about 1 nm to about 6 nm. Such compositions are conveniently in the form of dry powders for administration using a device comprising a dry powder reservoir to which a stream of propellant may be directed to disperse the powder and/or using a self propelling solvent/powder dispensing container such as a device comprising the active ingredient dissolved and/or suspended in a low-boiling propellant in a sealed container. Such powders comprise particles wherein at least 98% of the particles by weight have a diameter greater than 0.5 nm and at least 95% of the particles by number have a diameter less than 7 nm. Alternatively, at least 95% of the particles by weight have a diameter greater than 1 nm and at least 90% of the particles by number have a diameter less than 6 nm. Dry powder compositions may include a solid fine powder diluent such as sugar and are conveniently provided in a unit dose form.
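By way of illustration only, the two-part particle size criterion stated above (a minimum fraction by weight above a lower diameter and a minimum fraction by number below an upper diameter) may be checked as in the following minimal Python sketch. The function name `meets_size_spec` and the sample data are hypothetical. The sketch further assumes spherical particles of uniform density, so that each particle's weight is proportional to the cube of its diameter; that assumption is not stated in the disclosure. Diameters are given in the same units as the limits.

```python
from typing import Sequence


def meets_size_spec(
    diameters: Sequence[float],
    lower: float = 0.5,                    # lower diameter limit
    upper: float = 7.0,                    # upper diameter limit
    min_weight_frac_above: float = 0.98,   # fraction by weight above `lower`
    min_number_frac_below: float = 0.95,   # fraction by number below `upper`
) -> bool:
    """Return True if a powder sample satisfies both size criteria.

    Assumption (illustrative only): particles are spheres of uniform
    density, so relative weight is taken as diameter cubed.
    """
    if not diameters:
        raise ValueError("at least one particle diameter is required")
    if any(d <= 0 for d in diameters):
        raise ValueError("diameters must be positive")

    # Relative weight of each particle under the sphere assumption.
    weights = [d ** 3 for d in diameters]
    total_weight = sum(weights)

    # Fraction of the total weight carried by particles larger than `lower`.
    weight_above = sum(w for d, w in zip(diameters, weights) if d > lower)
    weight_frac_above = weight_above / total_weight

    # Fraction of the particle count with diameter smaller than `upper`.
    number_frac_below = sum(1 for d in diameters if d < upper) / len(diameters)

    return (weight_frac_above >= min_weight_frac_above
            and number_frac_below >= min_number_frac_below)


# Hypothetical sample: every particle lies between the two limits.
sample = [1.0] * 95 + [2.0] * 5
print(meets_size_spec(sample))
```

The alternative criterion (at least 95% by weight above 1 and at least 90% by number below 6) may be checked with the same function by passing `lower=1.0, upper=6.0, min_weight_frac_above=0.95, min_number_frac_below=0.90`.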

Low boiling propellants generally include liquid propellants having a boiling point of below 65° F. at atmospheric pressure. Generally the propellant may constitute 50% to 99.9% (w/w) of the composition, and active ingredient may constitute 0.1% to 20% (w/w) of the composition. A propellant may further comprise additional ingredients such as a liquid non-ionic and/or solid anionic surfactant and/or a solid diluent (which may have a particle size of the same order as particles comprising the active ingredient).

Pharmaceutical compositions formulated for pulmonary delivery may provide an active ingredient in the form of droplets of a solution and/or suspension. Such formulations may be prepared, packaged, and/or sold as aqueous and/or dilute alcoholic solutions and/or suspensions, optionally sterile, comprising active ingredient, and may conveniently be administered using any nebulization and/or atomization device. Such formulations may further comprise one or more additional ingredients including, but not limited to, a flavoring agent such as saccharin sodium, a volatile oil, a buffering agent, a surface active agent, and/or a preservative such as methylhydroxybenzoate. Droplets provided by this route of administration may have an average diameter in the range from about 0.1 nm to about 200 nm.

Formulations described herein as being useful for pulmonary delivery are useful for intranasal delivery of a pharmaceutical composition. Another formulation suitable for intranasal administration is a coarse powder comprising the active ingredient and having an average particle size from about 0.2 μm to 500 μm. Such a formulation is administered in the manner in which snuff is taken, i.e. by rapid inhalation through the nasal passage from a container of the powder held close to the nose.

Formulations suitable for nasal administration may, for example, comprise from about as little as 0.1% (w/w) and as much as 100% (w/w) of active ingredient, and may comprise one or more of the additional ingredients described herein. A pharmaceutical composition may be prepared, packaged, and/or sold in a formulation suitable for buccal administration. Such formulations may, for example, be in the form of tablets and/or lozenges made using conventional methods, and may, for example, comprise 0.1% to 20% (w/w) active ingredient, the balance comprising an orally dissolvable and/or degradable composition and, optionally, one or more of the additional ingredients described herein. Alternately, formulations suitable for buccal administration may comprise a powder and/or an aerosolized and/or atomized solution and/or suspension comprising active ingredient. Such powdered, aerosolized, and/or atomized formulations, when dispersed, may have an average particle and/or droplet size in the range from about 0.1 nm to about 200 nm, and may further comprise one or more of any additional ingredients described herein.

A pharmaceutical composition may be prepared, packaged, and/or sold in a formulation suitable for ophthalmic administration. Such formulations may, for example, be in the form of eye drops including, for example, a 0.1/1.0% (w/w) solution and/or suspension of the active ingredient in an aqueous or oily liquid excipient. Such drops may further comprise buffering agents, salts, and/or one or more other of any additional ingredients described herein. Other ophthalmically-administrable formulations which are useful include those which comprise the active ingredient in microcrystalline form and/or in a liposomal preparation. Ear drops and/or eye drops are contemplated as being within the scope of this invention.

General considerations in the formulation and/or manufacture of pharmaceutical agents may be found, for example, in Remington: The Science and Practice of Pharmacy 21st ed., Lippincott Williams & Wilkins, 2005 (incorporated herein by reference).

Administration

The present invention provides methods comprising administering supercharged proteins or complexes in accordance with the invention to a subject in need thereof. Supercharged proteins or complexes, or pharmaceutical, imaging, diagnostic, or prophylactic compositions thereof, may be administered to a subject using any amount and any route of administration effective for preventing, treating, diagnosing, or imaging a disease, disorder, and/or condition (e.g., a disease, disorder, and/or condition relating to working memory deficits). The exact amount required will vary from subject to subject, depending on the species, age, and general condition of the subject, the severity of the disease, the particular composition, its mode of administration, its mode of activity, and the like. Compositions in accordance with the invention are typically formulated in dosage unit form for ease of administration and uniformity of dosage. It will be understood, however, that the total daily usage of the compositions of the present invention will be decided by the attending physician within the scope of sound medical judgment. The specific therapeutically effective, prophylactically effective, or appropriate imaging dose level for any particular patient will depend upon a variety of factors including the disorder being treated and the severity of the disorder; the activity of the specific compound employed; the specific composition employed; the age, body weight, general health, sex and diet of the patient; the time of administration, route of administration, and rate of excretion of the specific compound employed; the duration of the treatment; drugs used in combination or coincidental with the specific compound employed; and like factors well known in the medical arts.

Supercharged proteins or complexes comprising supercharged proteins associated with at least one agent to be delivered and/or pharmaceutical, prophylactic, diagnostic, or imaging compositions thereof may be administered to animals, such as mammals (e.g., humans, domesticated animals, cats, dogs, mice, rats, etc.). In some embodiments, supercharged proteins or complexes and/or pharmaceutical, prophylactic, diagnostic, or imaging compositions thereof are administered to humans.

Supercharged proteins or complexes comprising supercharged proteins associated with at least one agent to be delivered and/or pharmaceutical, prophylactic, diagnostic, or imaging compositions thereof in accordance with the present invention may be administered by any route. In some embodiments, supercharged proteins or complexes, and/or pharmaceutical, prophylactic, diagnostic, or imaging compositions thereof, are administered by one or more of a variety of routes, including oral, intravenous, intramuscular, intra-arterial, intramedullary, intrathecal, subcutaneous, intraventricular, transdermal, interdermal, rectal, intravaginal, intraperitoneal, topical (e.g. by powders, ointments, creams, gels, lotions, and/or drops), mucosal, nasal, buccal, enteral, vitreal, intratumoral, sublingual; by intratracheal instillation, bronchial instillation, and/or inhalation; as an oral spray, nasal spray, and/or aerosol, and/or through a portal vein catheter. In some embodiments, supercharged proteins or complexes, and/or pharmaceutical, prophylactic, diagnostic, or imaging compositions thereof, are administered by systemic intravenous injection. In specific embodiments, supercharged proteins or complexes and/or pharmaceutical, prophylactic, diagnostic, or imaging compositions thereof may be administered intravenously and/or orally. In specific embodiments, supercharged proteins or complexes, and/or pharmaceutical, prophylactic, diagnostic, or imaging compositions thereof, may be administered in a way which allows the supercharged protein or complex to cross the blood-brain barrier, vascular barrier, or other epithelial barrier.

However, the invention encompasses the delivery of supercharged proteins or complexes, and/or pharmaceutical, prophylactic, diagnostic, or imaging compositions thereof, by any appropriate route taking into consideration likely advances in the sciences of drug delivery.

In general the most appropriate route of administration will depend upon a variety of factors including the nature of the supercharged protein or complex comprising supercharged proteins associated with at least one agent to be delivered (e.g., its stability in the environment of the gastrointestinal tract, bloodstream, etc.), the condition of the patient (e.g., whether the patient is able to tolerate particular routes of administration), etc. The invention encompasses the delivery of the pharmaceutical, prophylactic, diagnostic, or imaging compositions by any appropriate route taking into consideration likely advances in the sciences of drug delivery.

In certain embodiments, compositions in accordance with the invention may be administered at dosage levels sufficient to deliver from about 0.0001 mg/kg to about 100 mg/kg, from about 0.01 mg/kg to about 50 mg/kg, from about 0.1 mg/kg to about 40 mg/kg, from about 0.5 mg/kg to about 30 mg/kg, from about 0.01 mg/kg to about 10 mg/kg, from about 0.1 mg/kg to about 10 mg/kg, or from about 1 mg/kg to about 25 mg/kg, of subject body weight per day, one or more times a day, to obtain the desired therapeutic, diagnostic, prophylactic, or imaging effect. The desired dosage may be delivered three times a day, two times a day, once a day, every other day, every third day, every week, every two weeks, every three weeks, or every four weeks. In certain embodiments, the desired dosage may be delivered using multiple administrations (e.g., two, three, four, five, six, seven, eight, nine, ten, eleven, twelve, thirteen, fourteen, or more administrations).
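By way of a worked illustration of the weight-based dosage levels recited above, the following minimal Python sketch converts a dosage level in mg/kg of body weight per day into a total daily amount, and then into an amount per administration. The function names, the 70 kg body weight, and the 1 mg/kg level are hypothetical values selected for the example. The even split of the daily amount across administrations is one possible reading, adopted here as an assumption.

```python
# Broadest range recited in the text, in mg per kg of body weight per day.
MIN_MG_PER_KG = 0.0001
MAX_MG_PER_KG = 100.0


def daily_amount_mg(dose_mg_per_kg: float, body_weight_kg: float) -> float:
    """Total amount of active ingredient per day, in mg."""
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    if not MIN_MG_PER_KG <= dose_mg_per_kg <= MAX_MG_PER_KG:
        raise ValueError("dosage level is outside the broadest recited range")
    return dose_mg_per_kg * body_weight_kg


def amount_per_administration_mg(daily_mg: float, administrations_per_day: int) -> float:
    """Amount per administration, assuming an even split of the daily amount."""
    if administrations_per_day < 1:
        raise ValueError("at least one administration per day is required")
    return daily_mg / administrations_per_day


# Hypothetical example: 1 mg/kg per day for a 70 kg subject, given twice a day.
total = daily_amount_mg(1.0, 70.0)             # 70.0 mg per day
each = amount_per_administration_mg(total, 2)  # 35.0 mg per administration
print(total, each)
```

The sketch is arithmetic only. As stated above, the actual dose level for any particular patient is decided by the attending physician within the scope of sound medical judgment.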

Supercharged proteins or complexes comprising supercharged proteins associated with at least one agent to be delivered may be used in combination with one or more other therapeutic, prophylactic, diagnostic, or imaging agents. By “in combination with,” it is not intended to imply that the agents must be administered at the same time and/or formulated for delivery together, although these methods of delivery are within the scope of the invention. Compositions can be administered concurrently with, prior to, or subsequent to, one or more other desired therapeutics or medical procedures. In general, each agent will be administered at a dose and/or on a time schedule determined for that agent. In some embodiments, the invention encompasses the delivery of pharmaceutical, prophylactic, diagnostic, or imaging compositions in combination with agents that may improve their bioavailability, reduce and/or modify their metabolism, inhibit their excretion, and/or modify their distribution within the body.

It will further be appreciated that therapeutically, prophylactically, diagnostically, or imaging active agents utilized in combination may be administered together in a single composition or administered separately in different compositions. In general, it is expected that agents utilized in combination will be utilized at levels that do not exceed the levels at which they are utilized individually. In some embodiments, the levels utilized in combination will be lower than those utilized individually.

The particular combination of therapies (therapeutics or procedures) to employ in a combination regimen will take into account compatibility of the desired therapeutics and/or procedures and the desired therapeutic effect to be achieved. It will also be appreciated that the therapies employed may achieve a desired effect for the same disorder (for example, a composition useful for treating cancer in accordance with the invention may be administered concurrently with a chemotherapeutic agent), or they may achieve different effects (e.g., control of any adverse effects).

Kits

The invention provides a variety of kits for conveniently and/oreffectively carrying out methods of the present invention. Typicallykits will comprise sufficient amounts and/or numbers of components toallow a user to perform multiple treatments of a subject(s) and/or toperform multiple experiments.

In some embodiments, kits comprise one or more of (i) a superchargedprotein, as described herein; (ii) an agent to be delivered; (iii)instructions for forming complexes comprising supercharged proteinsassociated with at least one agent.

In some embodiments, kits comprise one or more of (i) a superchargedprotein, as described herein; (ii) a nucleic acid; (iii) instructionsfor forming complexes comprising supercharged proteins associated withat least one nucleic acid.

In some embodiments, kits comprise one or more of (i) a superchargedprotein, as described herein; (ii) a peptide or protein; (iii)instructions for forming complexes comprising supercharged proteinsassociated with at least one peptide or protein to be delivered.

In some embodiments, kits comprise one or more of (i) a superchargedprotein, as described herein; (ii) a small molecule; (iii) instructionsfor forming complexes comprising supercharged proteins associated withat least one small molecule.

In some embodiments, kits comprise one or more of (i) a superchargedprotein or complex comprising supercharged proteins associated with atleast one agent to be delivered, as described herein; (ii) at least onepharmaceutically acceptable excipient; (iii) a syringe, needle,applicator, etc. for administration of a pharmaceutical, prophylactic,diagnostic, or imaging composition to a subject; and (iv) instructionsfor preparing pharmaceutical composition and for administration of thecomposition to the subject.

In some embodiments, kits comprise one or more of (i) a pharmaceuticalcomposition comprising a supercharged protein or complex comprisingsupercharged proteins associated with at least one agent to bedelivered, as described herein; (ii) a syringe, needle, applicator, etc.for administration of the pharmaceutical, prophylactic, diagnostic, orimaging composition to a subject; and (iii) instructions foradministration of the pharmaceutical, prophylactic, diagnostic, orimaging composition to the subject.

In some embodiments, kits comprise one or more components useful for modifying proteins of interest to produce supercharged proteins. These kits typically include all or most of the reagents needed to create supercharged proteins. In certain embodiments, such a kit includes computer software to aid a researcher in designing a supercharged protein in accordance with the invention. In certain embodiments, such a kit includes reagents necessary for performing site-directed mutagenesis.

In some embodiments, kits may include additional components or reagents. For example, kits may comprise buffers, reagents, primers, oligonucleotides, nucleotides, enzymes, cells, media, plates, tubes, instructions, vectors, etc. In some embodiments, kits may comprise instructions for use.

In some embodiments, kits include a number of unit dosages of apharmaceutical, prophylactic, diagnostic, or imaging compositioncomprising supercharged proteins or complexes comprising superchargedproteins and at least one agent to be delivered. A memory aid may beprovided, for example in the form of numbers, letters, and/or othermarkings and/or with a calendar insert, designating the days/times inthe treatment schedule in which dosages can be administered. Placebodosages, and/or calcium dietary supplements, either in a form similar toor distinct from the dosages of the pharmaceutical, prophylactic,diagnostic, or imaging compositions, may be included to provide a kit inwhich a dosage is taken every day.

Kits may comprise one or more vessels or containers so that certain of the individual components or reagents may be separately housed. Kits may comprise a means for enclosing individual containers in relatively close confinement for commercial sale (e.g., a plastic box in which instructions, packaging materials such as styrofoam, etc., may be enclosed). Kit contents are typically packaged for convenient use in a laboratory.

These and other aspects of the present invention will be furtherappreciated upon consideration of the following Examples, which areintended to illustrate certain particular embodiments of the inventionbut are not intended to limit its scope, as defined by the claims.

EXAMPLES

Example 1 Supercharging Proteins can Impart Extraordinary Resilience

Materials and Methods

Design Procedure and Supercharged Protein Sequences

Solvent-exposed residues (shown in grey below) were identified frompublished structural data (Weber et al., 1989, Science, 243:85; Dirr etal., 1994, J. Mol. Biol., 243:72; Pedelacq et al., 2006, Nat.Biotechnol., 24:79; each of which is incorporated herein by reference)as those having AvNAPSA<150, where AvNAPSA is average neighbor atoms(within 10 Å) per sidechain atom. Charged or highly polarsolvent-exposed residues (DERKNQ) were mutated either to Asp or Glu, fornegative-supercharging; or to Lys or Arg, for positive-supercharging.Additional surface-exposed positions to mutate in green fluorescentprotein (GFP) variants were chosen on the basis of sequence variabilityat these positions among GFP homologues.
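The selection rule described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the AvNAPSA values are taken as precomputed inputs (the code does not derive them from a structure), the names `Residue` and `plan_mutations` are hypothetical, and the choice of Lys for positive supercharging and Glu for negative supercharging is an arbitrary simplification, since the text permits Lys or Arg and Asp or Glu, respectively.

```python
from dataclasses import dataclass

# Cutoff from the text: residues with AvNAPSA < 150 are treated as solvent-exposed.
AVNAPSA_CUTOFF = 150.0
# Charged or highly polar residue types eligible for mutation (DERKNQ).
MUTABLE = set("DERKNQ")


@dataclass
class Residue:
    position: int    # 1-based position in the sequence
    aa: str          # one-letter amino acid code
    avnapsa: float   # average neighbor atoms (within 10 angstroms) per sidechain atom


def plan_mutations(residues, direction, max_mutations=None):
    """Return {position: replacement} for a hypothetical supercharging plan.

    Assumption: Lys is used for positive and Glu for negative supercharging.
    Residues that already carry the desired charge are left unchanged.
    """
    if direction not in ("positive", "negative"):
        raise ValueError("direction must be 'positive' or 'negative'")
    already_charged = set("KR") if direction == "positive" else set("DE")
    replacement = "K" if direction == "positive" else "E"

    candidates = [
        r for r in residues
        if r.avnapsa < AVNAPSA_CUTOFF
        and r.aa in MUTABLE
        and r.aa not in already_charged
    ]
    # Lower AvNAPSA means fewer neighboring atoms, i.e. more solvent-exposed.
    candidates.sort(key=lambda r: r.avnapsa)
    if max_mutations is not None:
        candidates = candidates[:max_mutations]
    return {r.position: replacement for r in candidates}


# Toy input with made-up AvNAPSA values, for illustration only.
toy = [
    Residue(1, "D", 90.0),
    Residue(2, "L", 80.0),    # exposed, but not one of DERKNQ
    Residue(3, "K", 100.0),   # already positively charged
    Residue(4, "Q", 160.0),   # above the cutoff, treated as buried
    Residue(5, "N", 120.0),
    Residue(6, "E", 60.0),
]
positive_plan = plan_mutations(toy, "positive")
negative_plan = plan_mutations(toy, "negative")
print(positive_plan)
print(negative_plan)
```

Sorting by AvNAPSA before truncating reflects the idea of mutating the most solvent-exposed positions first; the `max_mutations` argument is a convenience added here and is not part of the described procedure.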

Protein Expression and Purification

Synthetic genes optimized for E. coli codon usage were purchased fromDNA 2.0, cloned into a pET expression vector (Novagen), andoverexpressed in E. coli BL21(DE3)pLysS for 5-10 hours at 15° C. Cellswere harvested by centrifugation and lysed by sonication. Proteins werepurified by Ni-NTA agarose chromatography (Qiagen), buffer-exchangedinto 100 mM NaCl, 50 mM potassium phosphate pH 7.5, and concentrated byultrafiltration (Millipore). All GFP variants were purified under nativeconditions.

Electrostatic Surface Potential Calculations (FIG. 1B-D)

Models of −30 and +48 supercharged GFP variants were based on thecrystal structure of superfolder GFP (Pedelacq et al., 2006, Nat.Biotechnol., 24:79; incorporated herein by reference). Electrostaticpotentials were calculated using APBS (Baker et al., 2001, Proc. Natl.Acad. Sci., USA, 98:10037; incorporated herein by reference) andrendered with PyMol (Delano, 2002, The PyMOL Molecular Graphics System,www.pymol.org; incorporated herein by reference) using a scale of −25kT/e (red) to +25 kT/e (blue).

Protein Staining and UV-Induced Fluorescence (FIG. 2A)

0.2 μg of each GFP variant was analyzed by electrophoresis in a 10%denaturing polyacrylamide gel and stained with Coomassie brilliant bluedye. 0.2 μg of the same protein samples in 25 mM Tris pH 8.0 with 100 mMNaCl was placed in a 0.2 mL Eppendorf tube and photographed under UVlight (360 nm).

Thermal Denaturation and Aggregation (FIG. 3A)

Purified GFP variants were diluted to 2 mg/mL in 25 mM Tris pH 8.0, 100mM NaCl, and 10 mM beta-mercaptoethanol (BME), then photographed underUV illumination (“native”). The samples were heated to 100° C. for 1minute, then photographed again under UV illumination (“boiled”).Finally, the samples were cooled 2 hours at room temperature andphotographed again under UV illumination (“cooled”).

Chemically Induced Aggregation (FIG. 3B)

2,2,2-trifluoroethanol (TFE) was added to produce solutions with 1.5mg/mL protein, 25 mM Tris pH 7.0, 10 mM BME, and 40% TFE. Aggregation at25° C. was monitored by right-angle light scattering.

Size-Exclusion Chromatography (Table 4)

The multimeric state of GFP variants was determined by analyzing 20-50μg of protein on a Superdex 75 gel-filtration column. Buffer was 100 mMNaCl, 50 mM potassium phosphate pH 7.5. Molecular weights weredetermined by comparison with a set of monomeric protein standards ofknown molecular weights analyzed separately under identical conditions.

TABLE 4 Calculated and experimentally determined protein properties.

name       MW (kD)  length (aa)  n_(pos)  n_(neg)  n_(charged)  Q_(net)  pI    ΔG (kcal/mol)^(a)  native MW (kD)^(b)  % soluble after boiling^(c)
GFP (−30)  27.8     248          19       49       68           −30      4.8   10.2               n.d.                98
GFP (−25)  27.8     248          21       46       67           −25      5.0   n.d.               n.d.                n.d.
sfGFP      27.8     248          27       34       61           −7       6.6   11.2               n.d.                4
GFP (+36)  28.5     248          56       20       76           +36      10.4  8.8                n.d.                97
GFP (+48)  28.6     248          63       15       78           +48      10.8  7.1                n.d.                n.d.

n_(pos), number of positively charged amino acids (per monomer)
n_(neg), number of negatively charged amino acids
n_(charged), total number of charged amino acids
Q_(net), theoretical net charge at neutral pH
pI, calculated isoelectric point
n.d., not determined
^(a)measured by guanidinium denaturation (FIG. 2C).
^(b)measured by size-exclusion chromatography.
^(c)percent protein remaining in supernatant after 5 min at 100° C., cooling to 25° C., and brief centrifugation.
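The charge columns of Table 4 are related by simple arithmetic: n_(charged) is the sum of n_(pos) and n_(neg), and Q_(net) is their difference. The following Python sketch recomputes both columns from the tabulated counts. The helper `net_charge_from_sequence` is a hypothetical illustration that assumes Lys and Arg count as +1, Asp and Glu count as −1, and all other residues (including His) are neutral at neutral pH; the source does not state the exact counting rule used.

```python
# (n_pos, n_neg) per monomer, copied from Table 4.
TABLE_4 = {
    "GFP(-30)": (19, 49),
    "GFP(-25)": (21, 46),
    "sfGFP": (27, 34),
    "GFP(+36)": (56, 20),
    "GFP(+48)": (63, 15),
}


def charge_summary(n_pos, n_neg):
    """Total charged residues and theoretical net charge from residue counts."""
    return {"n_charged": n_pos + n_neg, "q_net": n_pos - n_neg}


def net_charge_from_sequence(seq):
    """Theoretical net charge of a one-letter sequence.

    Assumption: K/R = +1, D/E = -1, everything else neutral.
    """
    n_pos = sum(seq.count(aa) for aa in "KR")
    n_neg = sum(seq.count(aa) for aa in "DE")
    return n_pos - n_neg


summary = {name: charge_summary(*counts) for name, counts in TABLE_4.items()}
for name, s in summary.items():
    print(f"{name:9s} n_charged={s['n_charged']:3d} Q_net={s['q_net']:+d}")
```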

Supercharged GFP

A variant of green fluorescent protein (GFP) called “superfolder GFP”(sfGFP) has been highly optimized for folding efficiency and resistanceto denaturants (Pedelacq et al., 2006, Nat. Biotechnol., 24:79;incorporated herein by reference). Superfolder GFP has a net charge of−7, similar to that of wild-type GFP. Guided by a simple algorithm tocalculate solvent exposure of amino acids (see Materials and Methods), asupercharged variant of GFP was designed. Supercharged GFP has atheoretical net charge of +36 and was created by mutating 29 of its mostsolvent-exposed residues to positively charged amino acids (FIG. 1). Theexpression of genes encoding either sfGFP or supercharged GFP(“GFP(+36)”) yielded intensely green-fluorescent bacteria. Followingprotein purification, the fluorescence properties of GFP(+36) weremeasured and found to be very similar to those of sfGFP.

Additional supercharged GFPs having net charges of +48, −25, and −30were designed and purified, all of which were also found to exhibitsfGFP-like fluorescence (FIG. 2A). All supercharged GFP variants showedcircular dichroism spectra similar to that of sfGFP, indicating that theproteins have similar secondary structure content (FIG. 2B). Thethermodynamic stabilities of the supercharged GFP variants were onlymodestly lower than that of sfGFP (1.0-4.1 kcal/mol, FIG. 2C and Table4) despite the presence of as many as 36 mutations.
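The stated range of 1.0-4.1 kcal/mol follows from the ΔG values in Table 4 when each supercharged variant is compared with sfGFP. A minimal Python sketch of that subtraction is given below; the function name `destabilization` is hypothetical, and only variants with a measured ΔG in Table 4 are included.

```python
# Folding free energies (kcal/mol) from Table 4; variants marked n.d. are omitted.
DELTA_G = {
    "sfGFP": 11.2,
    "GFP(-30)": 10.2,
    "GFP(+36)": 8.8,
    "GFP(+48)": 7.1,
}


def destabilization(name, reference="sfGFP"):
    """Loss in stability (kcal/mol) of a variant relative to the reference."""
    return round(DELTA_G[reference] - DELTA_G[name], 1)


for variant in ("GFP(-30)", "GFP(+36)", "GFP(+48)"):
    print(variant, destabilization(variant))
```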

Although sfGFP is the product of a long history of GFP optimization(Giepmans et al., 2006, Science, 312:217; incorporated herein byreference), it remains susceptible to aggregation induced by thermal orchemical unfolding. Heating sfGFP to 100° C. induced its quantitativeprecipitation and the irreversible loss of fluorescence (FIG. 3A). Incontrast, supercharged GFP(+36) and GFP(−30) remained soluble whenheated to 100° C., and recovered significant fluorescence upon cooling(FIG. 3A). While 40% 2,2,2-trifluoroethanol (TFE) induced the completeaggregation of sfGFP at 25° C. within minutes, the +36 and −30supercharged GFP variants suffered no significant aggregation or loss offluorescence under the same conditions for hours (FIG. 3B).

Supercharged GFP variants show a strong, reversible avidity for highlycharged macromolecules of the opposite charge (FIG. 3C). When mixedtogether in 1:1 stoichiometry, GFP(+36) and GFP(−30) immediately formeda green fluorescent co-precipitate, indicating the association of foldedproteins. GFP(+36) similarly co-precipitated with high concentrations ofRNA or DNA. Addition of NaCl was sufficient to dissolve these complexes,consistent with the electrostatic basis of their formation. In contrast,sfGFP was unaffected by the addition of GFP(−30), RNA, or DNA (FIG. 3C).

Conclusion

In summary, monomeric and multimeric proteins of varying structures andfunctions can be “supercharged” by simply replacing their mostsolvent-exposed residues with like-charged amino acids. Superchargingprofoundly alters the intermolecular properties of proteins, impartingremarkable aggregation resistance and the ability to associate in foldedform with oppositely charged macromolecules like “molecular Velcro.”

In contrast to these dramatic intermolecular effects, the intramolecularproperties of the seven supercharged proteins studied here, includingfolding, fluorescence, ligand binding, and enzymatic catalysis, remainedlargely intact. Supercharging therefore may represent a useful approachfor reducing the aggregation tendency and improving the solubility ofproteins without abolishing their function. These principles may beparticularly useful in de novo protein design efforts, whereunpredictable protein handling properties including aggregation remain asignificant challenge.

These observations may also illuminate the modest net-chargedistribution of natural proteins (Knight et al., 2004, Proc. Natl. Acad.Sci., USA, 101:8390; Gitlin et al., 2006, Angew Chem Int Ed Engl,45:3022; each of which is incorporated herein by reference): the netcharge of 84% of Protein Data Bank (PDB) polypeptides, for example,falls within ±10. The results above argue against the hypothesis thathigh net charge creates sufficient electrostatic repulsion to forceunfolding. Indeed, GFP(+48) has a higher positive net charge than anypolypeptide currently in the PDB, yet retains the ability to fold andfluoresce. Instead, these findings suggest that nonspecificintermolecular adhesions may have disfavored the evolution of too manyhighly charged natural proteins. Almost all natural proteins with veryhigh net charge, such as ribosomal proteins L3 (+36) and L15 (+44),which bind RNA, or calsequestrin (−80), which binds calcium cations,associate with oppositely charged species as part of their essentialcellular functions.

Example 2 Supercharged Proteins can be Used to Efficiently DeliverNucleic Acids to Cells

FIG. 5 demonstrates that supercharged GFPs associate non-specificallyand reversibly with oppositely charged macromolecules (“proteinVelcro”). Such interactions can result in the formation of precipitates.Unlike aggregates of denatured proteins, these precipitates containfolded, fluorescent GFP and dissolve in 1 M salt. Shown here are: +36GFP alone; +36 GFP mixed with −30 GFP; +36 GFP mixed with tRNA; +36 GFPmixed with tRNA in 1 M NaCl; superfolder GFP (“sf GFP”; −7 GFP); andsfGFP mixed with −30 GFP.

FIG. 6 demonstrates that superpositively charged GFP binds siRNA. Thebinding stoichiometry between +36 GFP and siRNA was determined by mixingvarious ratios of the two components (30 minutes at 25° C.) and runningthe mixture on a 3% agarose gel (Kumar et al., 2007, Nature, 449:39;incorporated herein by reference). Ratios of +36 GFP:siRNA tested were0:1, 1:1, 1:2, 1:3, 1:4, 1:5, and 1:10. +36 GFP/siRNA complexes did notco-migrate with siRNA in an agarose gel. +36 GFP was shown to form astable complex with siRNA in a ˜1:3 stoichiometry, indicating that onesupercharged GFP binds approximately three siRNA molecules. Thisproperty allows the application of low quantities of superpositivelycharged GFP to deliver siRNA effectively to cells. Moreover, because thedelivery reagent is fluorescent, and therefore observable byfluorescence microscopy, siRNA delivery can be assessed using thisspectroscopic technique. In contrast, non-superpositive proteins did notbind siRNA. A 50:1 ratio of sfGFP:siRNA was also tested, but, even atsuch high levels of excess, sfGFP did not associate with siRNA.

FIG. 7 demonstrates that superpositively charged GFP penetrates cells.HeLa cells were incubated with 1 nM GFP for 3 hours, washed, fixed, andstained. Three GFP variants were tested in this experiment: sf GFP (−7),−30 GFP, and +36 GFP. +36 GFP, but not sfGFP or −30 GFP, was shown topotently penetrate HeLa cells within minutes. Localization was shown tobegin at the cell membrane, becoming punctate and intracellularthereafter. +36 GFP was shown to be stable in HeLa cells for ≧5 days.Results are shown in FIG. 7. On the left is DAPI staining of DNA to markthe position of cells. In the middle is GFP staining to show wherecellular uptake of GFP occurred. On the right is a movie showinglocalization as it occurs.

In order to demonstrate the utility of superpositively charged GFP forsiRNA delivery, siRNA transfection efficiency using Lipofectamine 2000™(Invitrogen), a commonly used and commercially available cationic lipidtransfection reagent, was compared to superpositively charged GFP-basedsiRNA transfection in HeLa cells.

Generally, for a cell culture condition with a total volume of 1 mL, cells are plated to ˜80% confluency in 10% serum/media. The serum/media solution is removed, cells are washed twice with PBS, and 500 μL of serum-free media is added. In a separate vessel, 1 μL of 50 μM siRNA solution (total concentration 100 nM) and 1.66 μL of 15 μM sc(+36)GFP (total concentration 40 nM) are added to 500 μL of serum-free media. The contents are mixed by inversion and allowed to incubate for 5 minutes. The mixture is then added to the well containing 500 μL of serum-free media to give a final concentration of 50 nM siRNA and 20 nM scGFP. The cells are placed in a 37° C. incubator (5% CO₂) for 4 hours, after which the solution is removed and the cells are washed twice with PBS. Cells are then treated with 1 mL 10% FBS/media and allowed to incubate for 4 days before being harvested to determine gene knockdown.
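The siRNA concentrations quoted in this protocol follow from the usual dilution relation C1·V1 = C2·V2. The Python sketch below works through the siRNA component only; the function names are hypothetical, and by default the small volume of added stock is neglected, which matches the rounded values given in the text.

```python
def diluted_conc_nM(stock_uM, stock_uL, diluent_uL, include_added_volume=False):
    """Concentration (nM) after adding a stock solution to a diluent.

    By default the stock volume is neglected in the total volume, which is a
    simplifying assumption appropriate when it is small relative to the diluent.
    """
    total_uL = diluent_uL + (stock_uL if include_added_volume else 0.0)
    return stock_uM * 1000.0 * stock_uL / total_uL


def after_mixing_nM(conc_nM, sample_uL, added_uL):
    """Concentration (nM) after combining a sample with additional liquid."""
    return conc_nM * sample_uL / (sample_uL + added_uL)


# 1 uL of 50 uM siRNA in 500 uL of serum-free media.
sirna_premix_nM = diluted_conc_nM(50.0, 1.0, 500.0)
# The 500 uL pre-mix is added to a well that already holds 500 uL of media.
sirna_final_nM = after_mixing_nM(sirna_premix_nM, 500.0, 500.0)

print(sirna_premix_nM, sirna_final_nM)
```

Passing `include_added_volume=True` gives the slightly lower value obtained when the 1 μL of stock is counted in the total volume.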

FIG. 8 demonstrates that superpositively charged GFP is able to deliversiRNA into human cells. In particular, +36 GFP was shown to deliversiRNA into HeLa cells. +36 GFP delivered higher quantities of siRNA at amuch higher transfection efficiency than Lipofectamine. HeLa cells weretreated with either: ˜2 μM lipofectamine 2000 and 50 nM (125 pmol)Cy3-siRNA (left); or 30 nM of +36 GFP and 50 nM (125 pmol) Cy3-siRNA(right). Unlike Lipofectamine, +36 GFP did not induce cytotoxicity,particularly upon addition of antibiotics such as penicillin andstreptomycin.

In order to demonstrate the broad utility of supercharged proteins fornucleic acid delivery, this experiment has been repeated in a variety ofcells, including cells that are resistant to cationic lipid-based siRNAtransfection. FIGS. 9-11 demonstrate that superpositively charged GFP isable to deliver siRNA into cell lines that are resistant to traditionaltransfection methods. FIG. 9 demonstrates that superpositively chargedGFP is able to deliver siRNA into 3T3-L₁ pre-adipocyte cells (“3T3Lcells”). 3T3L cells were treated with either: ˜2 μM Lipofectamine 2000and 50 nM (125 pmol) Cy3-siRNA (left); or nM+36 GFP and 50 nM (125 pmol)Cy3-siRNA (right). Murine 3T3-L₁ pre-adipocyte cells were poorlytransfected by Lipofectamine but were efficiently transfected by +36GFP. Hoechst channel, blue, was used to visualize DNA, thereby markingthe position of cells; Cy3 channel, red, was used to visualizeCy3-tagged siRNA; GFP channel, green, was used to visualize GFP. Yellowindicates sites of co-localization between siRNA and GFP. UnlikeLipofectamine, +36 GFP did not induce cytotoxicity, particularly uponaddition of antibiotics such as penicillin and streptomycin.

FIG. 10 demonstrates that superpositively charged GFP is able to deliver siRNA into rat IMCD cells. Rat IMCD cells were treated with either ˜2 μM Lipofectamine 2000 and 50 nM (125 pmol) Cy3-siRNA (left); or 20 nM +36 GFP and 50 nM (125 pmol) Cy3-siRNA (right). Rat IMCD cells were poorly transfected by Lipofectamine but were efficiently transfected by +36 GFP. Hoechst channel, blue, was used to visualize DNA, thereby marking the position of cells; Cy3 channel, red, was used to visualize Cy3-tagged siRNA; GFP channel, green, was used to visualize GFP. Yellow indicates sites of co-localization between siRNA and GFP. Unlike Lipofectamine, +36 GFP did not induce cytotoxicity, particularly upon addition of antibiotics such as penicillin and streptomycin.

FIG. 11 demonstrates that superpositively charged GFP is able to deliver siRNA into human ST14A neurons. Human ST14A neurons were treated with either ˜2 μM Lipofectamine 2000 and 50 nM (125 pmol) Cy3-siRNA; or 50 nM +36 GFP and 50 nM (125 pmol) Cy3-siRNA. Human ST14A neurons were weakly transfected by Lipofectamine but were efficiently transfected by +36 GFP. DAPI channel, blue, was used to visualize DNA, thereby marking the position of cells; Cy3 channel, red, was used to visualize Cy3-tagged siRNA; GFP channel, green, was used to visualize GFP. Yellow indicates sites of co-localization between siRNA and GFP. Results similar to those presented in FIGS. 9-11 were observed in two other cell types that are resistant to traditional transfection methods (i.e., Jurkat cells and PC12 cells). Unlike Lipofectamine, +36 GFP did not induce cytotoxicity, particularly upon addition of antibiotics such as penicillin and streptomycin.

FIG. 12 presents flow cytometry analysis of siRNA transfection experiments. Each column corresponds to experiments performed with different transfection methods: Lipofectamine (blue); and 20 nM +36 GFP (red). Each chart corresponds to experiments performed with different cell types: IMCD cells, PC12 cells, HeLa cells, 3T3L cells, and Jurkat cells. The X-axis represents measurements obtained from the Cy3 channel, which is a readout of siRNA fluorescence. The Y-axis represents cell count in flow cytometry experiments. Flow cytometry data indicate that cells were more efficiently transfected with siRNA using +36 GFP than Lipofectamine.

In order to demonstrate the effectiveness of +36 GFP-delivered siRNA in suppressing gene expression, cellular levels of GAPDH were examined by western blot. As shown in FIG. 13, +36 GFP effectively delivered siRNA to cells and suppressed GAPDH to levels comparable to those achieved with Lipofectamine. 50 nM GAPDH siRNA was transfected into five different cell types (HeLa, IMCD, 3T3L, PC12, and Jurkat cell lines) using either ˜2 μM Lipofectamine 2000 (black bars) or 20 nM +36 GFP (green bars). The Y-axis represents GAPDH protein levels as a fraction of tubulin protein levels.

FIG. 14 demonstrates the effects of a variety of mechanistic probes of cell penetration on superpositively charged GFP-mediated siRNA transfection. HeLa cells were treated with one of a variety of probes for 30 minutes and were then treated with 5 nM +36 GFP. Cells were then washed with heparin + probe and imaged in PBS + probe. Samples included: no probe; 4° C. preincubation (inhibits energy-dependent processes); 100 mM sucrose (inhibits clathrin-mediated endocytosis); 25 μg/ml nystatin (disrupts caveolar function); 25 μM cytochalasin B (inhibits macropinocytosis); and 5 μM monensin (inhibits endosome receptor recycling). Experiments at 4° C. demonstrated that cell penetration of +36 GFP involves energy consumption. Experiments with sucrose and nystatin demonstrated that cellular uptake of +36 GFP does not involve clathrin-mediated endocytosis or caveolar endocytosis. Experiments with cytochalasin B and monensin demonstrated that cellular uptake of +36 GFP does not involve macropinocytosis, but is likely to involve early endosomes.

FIG. 15 demonstrates various factors contributing to cell-penetrating activity. Charge density was shown to contribute to cell-penetrating activity. For example, 60 nM Arg₆ was shown not to transfect siRNA. Charge magnitude was shown to contribute to cell-penetrating activity. For example, +15 GFP was shown not to penetrate cells or transfect siRNA. “Protein-like” character was also shown to contribute to cell-penetrating activity. For example, 60 nM Lys₂₀₋₅₀ was shown not to transfect siRNA. The present invention demonstrates that, in some embodiments, charge density is not sufficient to allow a protein to penetrate into cells. The present invention demonstrates that, in some situations, charge magnitude may be necessary but not sufficient to allow a protein to penetrate into cells. The present invention further shows that some protein-like features may contribute to cell penetration.

Example 3 Mammalian Cell Penetration, siRNA Transfection, and DNATransfection by Supercharged Green Fluorescent Proteins

Methods for resurfacing proteins without abolishing their structure or function through the extensive mutagenesis of non-conserved, solvent-exposed residues were previously described (Lawrence M S, Phillips K J, Liu D R (2007). Supercharging proteins can impart unusual resilience. J. Am. Chem. Soc. 129:10110-10112; International PCT patent application, PCT/US07/70254, filed Jun. 1, 2007, published as WO 2007/143574 on Dec. 13, 2007; U.S. provisional patent applications, U.S. Ser. No. 60/810,364, filed Jun. 2, 2006, and U.S. Ser. No. 60/836,607, filed Aug. 9, 2006; each of which is incorporated herein by reference). When the replacement residues are all positively or all negatively charged, the resulting “supercharged” proteins can retain their activity while gaining unusual properties such as robust resistance to aggregation and the ability to bind oppositely charged macromolecules. For example, a green fluorescent protein with a +36 net theoretical charge (+36 GFP) was highly aggregation-resistant, could retain fluorescence even after being boiled and cooled, and reversibly complexed DNA and RNA through electrostatic interactions.

A variety of cationic peptides with the ability to penetrate mammaliancells including peptides derived from HIV Tat (Frankel A D, Pabo C O(1988) Cellular uptake of the tat protein from human immunodeficiencyvirus. Cell 55: 1189-1193; Green M, Loewenstein P M (1988) Autonomousfunctional domains of chemically synthesized human immunodeficiencyvirus tat trans-activator protein. Cell 55: 1179-1188; each of which isincorporated herein by reference) and penetratin from the Antennapediahomeodomain (Thoren P E, Persson D, Karlsson M, Norden B (2000) Theantennapedia peptide penetratin translocates across lipid bilayers—thefirst direct observation. FEBS Lett 482: 265-268; incorporated herein byreference) have been previously described. Schepartz and coworkers haverecently shown that small, folded proteins containing a minimal cationicmotif embedded within a type II polyproline helix efficiently penetrateeukaryotic cells (Daniels D S, Schepartz A (2007) Intrinsicallycell-permeable miniature proteins based on a minimal cationic PPIImotif. J Am Chem Soc 129: 14578-14579; Smith B A, Daniels D S, Coplin AE, Jordan G E, McGregor L M, et al. (2008) Minimally cationiccell-permeable miniature proteins via alpha-helical arginine display. JAm Chem Soc 130: 2948-2949; each of which is incorporated herein byreference). Raines and coworkers recently engineered proteins with asurface-exposed poly-arginine patch that confers the ability topenetrate cells (Fuchs S M, Raines R T (2007) Arginine grafting to endowcell permeability. ACS Chem Biol 2: 167-170; Fuchs S M, Rutkoski T J,Kung V M, Groeschl R T, Raines R T (2007) Increasing the potency of acytotoxin with an arginine graft. Protein Eng Des Sel 20: 505-509; eachof which is incorporated herein by reference). In light of thesestudies, it was suggested that superpositively charged proteins such as+36 GFP might associate with negatively charged components of the cellmembrane in a manner that results in cell penetration.

The present Example describes, inter alia, the cell-penetratingcharacteristics of superpositively charged GFP variants with net chargesof +15, +25, and +36. It was found that +36 GFP potently enters cellsthrough sulfated peptidoglycan-mediated, actin-dependent endocytosis.When pre-mixed with siRNA, +36 GFP delivers siRNA effectively andwithout cytotoxicity into a variety of cell lines, including severalknown to be resistant to cationic lipid-mediated transfection. The siRNAdelivered into cells using +36 GFP was able to effect gene silencing infour out of five mammalian cell lines tested. Comparison of the siRNAtransfection ability of +36 GFP with that of several synthetic peptidesof comparable or greater charge magnitude and charge density suggeststhat the observed mode of siRNA delivery may require protein-likefeatures of +36 GFP that are not present among cationic peptides. Whenfused to an endosomolytic peptide derived from hemagglutinin, +36 GFP isalso able to transfect plasmid DNA into several cell lines that resistcationic lipid-mediated transfection in a manner that enablesplasmid-based gene expression.

Results Mammalian Cell Penetration by Supercharged GFPs.

A series of resurfaced variants of “superfolder GFP” (sfGFP) was previously generated and characterized (Pedelacq J D, Cabantous S, Tran T, Terwilliger T C, Waldo G S (2006) Engineering and characterization of a superfolder green fluorescent protein. Nat Biotechnol 24: 79-88; incorporated herein by reference) with theoretical net charges ranging from −30 to +48 that retain fluorescence (Lawrence M S, Phillips K J, Liu D R (2007) Supercharging proteins can impart unusual resilience. J Am Chem Soc 129: 10110-10112; incorporated herein by reference). The evaluation of the ability of these supercharged GFPs to penetrate mammalian cells requires a method to remove surface-bound, non-internalized GFP. It was, therefore, confirmed that washing conditions known to remove surface-bound cationic proteins from cells (Pedelacq J D, Cabantous S, Tran T, Terwilliger T C, Waldo G S (2006) Engineering and characterization of a superfolder green fluorescent protein. Nat Biotechnol 24: 79-88) also effectively remove cell surface-bound superpositively charged GFP. HeLa cells were treated with +36 GFP at 4° C., a temperature that allows +36 GFP to bind to the outside of cells but blocks internalization (vide infra). Cells were washed three times at 4° C. with either PBS or with PBS containing heparin and analyzed by flow cytometry for GFP fluorescence. Cells washed with PBS were found to have significant levels of GFP (presumably surface-bound), while cells washed with PBS containing heparin exhibited GFP fluorescence intensity very similar to that of untreated cells (FIG. 22). These observations confirmed the effectiveness of three washes with heparin at removing surface-bound superpositively charged GFP.

Next, HeLa cells were incubated with 10-500 nM sfGFP (theoretical netcharge of −7), −30 GFP, +15 GFP, +25 GFP, or +36 GFP for 4 hours at 37°C. (FIG. 16A). After incubation, cells were washed three times with PBScontaining heparin and analyzed by flow cytometry. No detectableinternalized protein was observed in cells treated with sfGFP or −30GFP. HeLa cells treated with +25 GFP or +36 GFP, however, were found tocontain high levels of internalized GFP. In contrast, cells treated with+15 GFP contained 10-fold less internalized GFP, indicating thatpositive charge magnitude is an important determinant of effective cellpenetration (FIG. 16B). It was found that +36 GFP readily penetratesHeLa cells even at concentrations as low as 10 nM (FIG. 23).

In order to test the generality of cell penetration by +36 GFP, these experiments were repeated using four additional mammalian cell types: inner medullary collecting duct (IMCD) cells, 3T3-L pre-adipocytes, rat pheochromocytoma PC12 cells, and Jurkat T-cells. Flow cytometry analysis revealed that 200 nM +36 GFP effectively penetrates all five types of cells tested (FIG. 16C). Internalization of +36 GFP in stably adherent HeLa, IMCD, and 3T3-L cell lines was confirmed by fluorescence microscopy (vide infra). Real-time imaging showed that +36 GFP bound rapidly to the cell membrane of HeLa cells and was internalized within minutes as punctate foci that migrated towards the interior of the cell and consolidated into larger foci, consistent with uptake via endocytosis.

Determinants of Cell-Penetration Potency.

To determine the effect of net charge, charge distribution, and charge structure on cell-penetration potency of supercharged proteins, cells were treated with supercharged GFP variants having either various, evenly distributed net charges (supercharged GFP series) or unevenly distributed and/or unstructured charges (+48 GFP chimera series and GFP with +10 Lys/Arg tail) (FIG. 47). A large potency increase at ˜+22 (˜0.8 charge units per kD) was observed in the supercharged GFP series. Charge distribution and charge structure also had a marked effect on cell penetration potency, suggesting that not just charge magnitude, but also charge distribution and protein structure determine cell-penetration characteristics in the high-potency regime.
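The figure of approximately 0.8 charge units per kD can be reproduced by dividing net charge by molecular weight. The Python sketch below is illustrative only: it assumes that a GFP variant near the +22 threshold has roughly the molecular weight listed for sfGFP in Table 4 (27.8 kD), which the text does not state, and the function name `charge_per_kd` is hypothetical.

```python
def charge_per_kd(net_charge, mw_kd):
    """Signed theoretical net charge per kD of molecular weight."""
    return net_charge / mw_kd


# Assumption: the +22 variant weighs about as much as sfGFP (27.8 kD, Table 4).
threshold = charge_per_kd(22, 27.8)

# (Q_net, MW in kD) for variants listed in Table 4.
table_4 = {
    "sfGFP": (-7, 27.8),
    "GFP(+36)": (36, 28.5),
    "GFP(+48)": (48, 28.6),
}
densities = {name: charge_per_kd(q, mw) for name, (q, mw) in table_4.items()}

print(f"threshold ~ {threshold:.2f} charge units per kD")
for name, d in densities.items():
    print(f"{name:9s} {d:+.2f}")
```

Under this assumption, +36 GFP and +48 GFP lie above the apparent threshold and sfGFP lies well below it, in line with the penetration results reported for these variants.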

Mechanistic Probes of +36 GFP Cell Penetration

To illuminate the mechanism by which +36 GFP enters cells, the cell penetration experiments were repeated in HeLa cells under a variety of conditions that each block a different component of an endocytosis pathway (Payne C K, Jones S A, Chen C, Zhuang X (2007) Internalization and trafficking of cell surface proteoglycans and proteoglycan-binding ligands. Traffic 8: 389-401; Veldhoen S, Laufer S D, Trampe A, Restle T (2006) Cellular delivery of small interfering RNA by a non-covalently attached cell-penetrating peptide: quantitative analysis of uptake and biological effect. Nucleic Acids Res 34: 6561-6573; each of which is incorporated herein by reference). Cell penetration of +36 GFP was not observed when HeLa cells were cooled to 4° C. prior to and during +36 GFP treatment (FIG. 17B). This result suggests that uptake of +36 GFP requires an energy-dependent process, consistent with endocytosis (Deshayes S, Morris M C, Divita G, Heitz F (2005) Cell-penetrating peptides: tools for intracellular delivery of therapeutics. Cell Mol Life Sci 62: 1839-1849; incorporated herein by reference). The effects of 5 μg/mL filipin or 25 μg/mL nystatin, small molecules known to inhibit caveolin-dependent endocytosis, were evaluated. Neither inhibitor significantly altered +36 GFP internalization (FIGS. 17C and 17D, respectively). Treatment with chlorpromazine, a known inhibitor of clathrin-mediated endocytosis, similarly had little effect on +36 GFP cell penetration (FIG. 17E). In addition, simultaneous treatment of HeLa cells with 50 nM +36 GFP and 10 μg/mL of fluorescently labeled transferrin, a protein known to be internalized in a clathrin-dependent manner (Hopkins C R, Trowbridge I S (1983) Internalization and processing of transferrin and the transferrin receptor in human carcinoma A431 cells. J Cell Biol 97: 508-521; incorporated herein by reference), resulted in little GFP/transferrin co-localization (FIG. 17F). Treatment with cytochalasin D, an actin polymerization inhibitor, however, significantly decreased +36 GFP cell penetration (FIG. 17G). Taken together, these results are consistent with a model in which +36 GFP uptake proceeds through an endocytotic pathway that is energy-dependent, requires actin polymerization, and does not require clathrin or caveolin.

Based on previous studies on the mechanism of cellular uptake of cationic peptides (Payne C K, Jones S A, Chen C, Zhuang X (2007) Internalization and trafficking of cell surface proteoglycans and proteoglycan-binding ligands. Traffic 8: 389-401; Fuchs S M, Raines R T (2004) Pathway for polyarginine entry into mammalian cells. Biochemistry 43: 2438-2444; each of which is incorporated herein by reference), it was suggested that anionic cell-surface proteoglycans might serve as receptors to mediate +36 GFP internalization. To probe this hypothesis, HeLa cells were pre-treated with 80 mM sodium chlorate, an inhibitor of ATP sulphurylase, an enzyme required for the biosynthesis of sulfated proteoglycans (Baeuerle P A, Huttner W B (1986) Chlorate—a potent inhibitor of protein sulfation in intact cells. Biochem Biophys Res Commun 141: 870-877; incorporated herein by reference). These conditions completely blocked +36 GFP penetration (FIG. 17H). As a further probe of the role proteoglycans play in +36 GFP uptake, internalization was compared in wild-type Chinese hamster ovary (CHO) cells and in proteoglycan-deficient CHO cells (PGD-CHO) that lack xylosyltransferase, an enzyme required for glycosaminoglycan synthesis. Wild-type CHO cells (FIG. 17I), but not PGD-CHO cells (FIG. 17J), efficiently internalized +36 GFP. These findings suggest that +36 GFP penetration of mammalian cells requires binding to sulfated cell-surface proteoglycans.

+36 GFP Binds siRNA and Delivers siRNA into a Variety of Mammalian Cell Lines

The ability of superpositively charged proteins to form complexes with DNA and tRNA was previously reported (Lawrence et al. (2007) Supercharging proteins can impart unusual resilience. J Am Chem Soc 129: 10110-10112; incorporated herein by reference). In light of these results, the ability of +15, +25, and +36 GFP to bind siRNA in vitro in a variety of stoichiometric ratios was evaluated. Using a gel-shift assay (Kumar P, Wu H, McBride J L, Jung K E, Kim M H, et al. (2007) Transvascular delivery of small interfering RNA to the central nervous system. Nature 448: 39-43; incorporated herein by reference), binding of +25 and +36 GFP to siRNA with a stoichiometry of ~2:1 was observed, while greater than five +15 GFP proteins on average were required to complex a single siRNA molecule (FIG. 18A). In contrast, 100 equivalents of sfGFP did not detectably bind siRNA under the assay conditions.

Next, the ability of +15, +25, and +36 GFP to deliver bound siRNA into HeLa cells was examined. A Cy3-conjugated GAPDH siRNA (Ambion) was briefly mixed with 200 nM +36 GFP and the resulting mixture was added to cells in serum-free media for 4 hours. The cells were washed three times with PBS containing heparin and analyzed by flow cytometry for Cy3-siRNA uptake. It was observed that +25 and +36 GFP delivered 100- and 1000-fold more siRNA into HeLa cells, respectively, than treatment with siRNA alone (FIG. 18B), and ~20-fold more siRNA than was delivered with the common cationic lipid transfection reagent Lipofectamine 2000 (FIG. 18C). In contrast, +15 GFP did not efficiently transfect siRNA into HeLa cells (FIG. 18B).

In addition to HeLa cells, +36 GFP was able to efficiently deliver siRNA into IMCD cells, 3T3-L preadipocytes, rat pheochromocytoma PC12 cells, and Jurkat T-cells, four cell lines that are resistant to siRNA transfection using Lipofectamine 2000 (Carlotti F, Bazuine M, Kekarainen T, Seppen J, Pognonec et al. (2004) Lentiviral vectors efficiently transduce quiescent mature 3T3-L1 adipocytes. Mol Ther 9: 209-217; Ma H, Zhu J, Maronski M, Kotzbauer P T, Lee V M, Dichter M A, et al. (2002) Non-classical nuclear localization signal peptides for high efficiency lipofection of primary neurons and neuronal cell lines. Neuroscience 112: 1-5; McManus M T, Haines B B, Dillon C P, Whitehurst C E, van Parijs L, et al. (2002) Small interfering RNA-mediated gene silencing in T lymphocytes. J Immunol 169: 5754-5760; Strait K A, Stricklett P K, Kohan J L, Miller M B, Kohan D E (2007) Calcium regulation of endothelin-1 synthesis in rat inner medullary collecting duct. Am J Physiol Renal Physiol 293: F601-606; each of which is incorporated herein by reference). Treatment with Lipofectamine 2000 and Cy3-siRNA resulted in efficient siRNA delivery in HeLa cells, but no significant delivery of siRNA into IMCD, 3T3-L, PC12, or Jurkat cells (FIG. 18C). Treatment of IMCD or 3T3-L cells with Fugene 6 (Roche), a different cationic lipid transfection agent, and Cy3-siRNA also did not result in significant siRNA delivery into these cells (FIG. 24). In contrast, treatment with +36 GFP and Cy3-siRNA resulted in significant siRNA levels in all five cell lines tested (FIG. 18C). Compared with Lipofectamine 2000, +36 GFP resulted in 20- to 200-fold higher levels of Cy3 signal in all cases. Based on the effectiveness of three heparin washes at removing non-internalized +36 GFP (FIG. 22), these higher Cy3 levels can be attributed to higher levels of internalized Cy3-siRNA rather than to cell surface-bound +36 GFP/Cy3-siRNA complexes. Consistent with this interpretation, fluorescence microscopy of the adherent cell lines used in this study (HeLa, IMCD, and 3T3-L) revealed internalized Cy3-siRNA and +36 GFP in punctate foci that are presumed to be endosomes (FIG. 18D). These results collectively indicate that +36 GFP can effectively deliver siRNA into a variety of mammalian cell lines, including several that are poorly transfected by commonly used cationic lipid transfection reagents.

When HeLa cells were treated with a premixed solution containing 200 nM +36 GFP and 50 nM Cy3-siRNA in the presence of cytochalasin D or at 4° C., no internalized GFP or Cy3-siRNA was observed (FIG. 30). These data support a mechanism of siRNA delivery that is dependent on endocytosis and actin polymerization, consistent with the present inventors' mechanistic studies of +36 GFP in the absence of siRNA.

Size and Cytotoxicity of +36 GFP-siRNA Complexes.

+36 GFP-siRNA complexes were analyzed by dynamic light scattering (DLS) using stoichiometric ratios identical to those used for transfection. From a mixture containing 20 μM +36 GFP and 5 μM siRNA, a fairly monodisperse population of particles with a hydrodynamic radius (Hr) of 880.6±62.2 nm was observed (FIG. 31A), consistent with microscopy data (FIG. 31B). These observations demonstrate the potential for +36 GFP to form large particles when mixed with siRNA, a phenomenon observed by previous researchers using cationic delivery reagents (Deshayes et al., 2005, Cell Mol. Life Sci., 62:1839-49; and Meade and Dowdy, 2008, Adv. Drug Deliv. Rev., 60:530-36; both of which are incorporated herein by reference).

To assess the cytotoxicity of +36 GFP-siRNA complexes, MTT assays were performed on all five cell lines 24 hours after treatment with 0.2 to 2 μM +36 GFP and 50 nM siRNA. These assays revealed no significant apparent cytotoxicity to HeLa, IMCD, 3T3-L, PC12, or Jurkat cells (FIG. 25A).

Gene Silencing with +36 GFP-Delivered siRNA

While the above results demonstrate the ability of +36 GFP to deliver siRNA into a variety of mammalian cells, they do not establish the availability of this siRNA for gene silencing. Based on the punctate localization of intracellular +36 GFP (FIG. 18D), it was suggested that gene silencing would require at least partial escape of +36 GFP-transfected siRNA from endosomes. To evaluate the gene suppression activity of siRNA delivered with +36 GFP, HeLa, IMCD, 3T3-L, PC12, and Jurkat cells were treated with a solution containing 50 nM of GAPDH-targeting siRNA and either ~2 μM Lipofectamine 2000 or 200 nM +36 GFP. Cells were exposed to the siRNA transfection solution for 4 hours, then grown for up to 4 days.

In HeLa cells, observed decreases in GAPDH mRNA and protein levels indicate that both Lipofectamine 2000 and +36 GFP mediate efficient siRNA-induced suppression of GAPDH expression with similar kinetics. GAPDH-targeting siRNA delivered with Lipofectamine 2000 or +36 GFP resulted in a ~85% decrease in GAPDH mRNA level after 72 hours (FIG. 19A). Similarly, a decrease in GAPDH protein levels of ~75% was observed in HeLa cells 96 hours after delivery of siRNA with Lipofectamine 2000 or with +36 GFP (FIG. 19B). Likewise, delivery of β-actin-targeting siRNA with either ~2 μM Lipofectamine 2000 or 200 nM +36 GFP resulted in a decrease in β-actin protein levels in HeLa cells of 70-78% for both transfection agents (FIG. 19B).

In contrast to the efficiency of gene suppression in HeLa cells, treatment with Lipofectamine 2000 and 50 nM siRNA in IMCD, 3T3-L, PC12, and Jurkat cells effected no significant decrease in GAPDH protein levels (FIG. 19C), consistent with the resistance of these cell lines to cationic lipid-mediated transfection (FIG. 18C). However, treatment with 200 nM +36 GFP and 50 nM siRNA resulted in 44-60% suppression of GAPDH protein levels in IMCD, 3T3-L, and PC12 cells (FIG. 19C). Despite efficient siRNA delivery by +36 GFP (FIG. 18C), no significant siRNA-mediated suppression of GAPDH expression in Jurkat cells was observed (FIG. 19C).

We speculated that enhancing the escape of +36 GFP-delivered siRNA from endosomes may increase the effectiveness of gene silencing. In an attempt to chemically disrupt endocytotic vesicles, cells were treated with 200 nM +36 GFP and 50 nM siRNA together with either chloroquine, a small molecule known to have endosomolytic activity (Erbacher P, Roche A C, Monsigny M, Midoux P (1996) Putative role of chloroquine in gene transfer into a human hepatoma cell line by DNA/lactosylated polylysine complexes. Exp Cell Res 225: 186-194; incorporated herein by reference), or pyrene butyric acid, which has been shown to increase cytosolic distribution of internalized poly-arginine (Takeuchi T, Kosuge M, Tadokoro A, Sugiura Y, Nishi M, et al. (2006) Direct and rapid cytosolic delivery using cell-penetrating peptides mediated by pyrenebutyrate. ACS Chem Biol 1: 299-303; incorporated herein by reference). Addition of these reagents to mixtures containing +36 GFP and siRNA proved cytotoxic in the cell lines tested. In addition, we generated and purified a C-terminal fusion of +36 GFP and the hemagglutinin 2 (HA2) peptide, which has been reported to enhance endosome degradation (Lundberg P, El-Andaloussi S, Sutlu T, Johansson H, Langel U (2007) Delivery of short interfering RNA using endosomolytic cell-penetrating peptides. FASEB J 21: 2664-2671; incorporated herein by reference). As was the case with +36 GFP, the HA2-fused variant exhibited low cytotoxicity in the five cell lines tested (FIG. 25A). While the delivery of siRNA with the +36 GFP-HA2 fusion resulted in decreased GAPDH protein levels in HeLa, IMCD, 3T3-L, and PC12 cells, the degree of suppression was comparable to that arising from the use of +36 GFP (FIG. 19C).

Together, these results indicate that +36 GFP and +36 GFP-HA2 are capable of delivering siRNA and effecting gene silencing in a variety of mammalian cells, including some cell lines that do not exhibit gene silencing when treated with siRNA and cationic lipid-based transfection agents.

Stability of +36 GFP and Stability of RNA and DNA Complexed with +36 GFP

In addition to exhibiting generality across different mammalian cell types and low cytotoxicity, siRNA delivery agents may need to be resistant to rapid degradation. Treatment of +36 GFP with proteinase K (a robust, broad-spectrum protease) revealed that +36 GFP exhibits significant protease resistance compared with bovine serum albumin (BSA). While no uncleaved BSA remained one hour after proteinase K digestion, 68% of +36 GFP remained uncleaved after one hour, and 48% remained uncleaved after six hours (FIG. 32A). We also treated +36 GFP with murine serum at 37° C. (FIG. 32B). After six hours, no significant degradation was observed, suggesting its potential in vivo serum stability. In comparison, when bovine serum albumin was incubated in mouse serum for the same period of time, 71% degradation was observed after three hours, and complete degradation by four hours.

The ability of +36 GFP to protect siRNA and plasmid DNA from degradation was assessed. siRNA or siRNA pre-complexed with +36 GFP was treated with murine serum at 37° C. After three hours, only 5.9% of the siRNA remained intact in the sample lacking +36 GFP, while 34% of the siRNA remained intact in the sample pre-complexed with +36 GFP (FIG. 32C). Similarly, while plasmid DNA was nearly completely degraded by murine serum after 30 minutes at 37° C., virtually all plasmid DNA pre-complexed with +36 GFP remained intact after 30 minutes, and 84% of plasmid DNA was intact after one hour (FIG. 32D). These results together indicate that +36 GFP is capable of significantly inhibiting serum-mediated siRNA and plasmid DNA degradation.

Comparison of +36 GFP with Synthetic Cationic Peptides

To probe the features of superpositively charged GFPs that impart their ability to deliver siRNA into cells, we compared the siRNA transfection ability of +36 GFP at 200 nM with that of a panel of synthetic cationic peptides at 200 nM or 2 μM. This panel consisted of poly-(L)-Lys (a mixture containing an average of ~30 Lys residues per polypeptide), poly-(D)-Lys, Arg₉, and a synthetic +36 peptide ((KKR)₁₁RRK) that contains the same theoretical net charge and Lys:Arg ratio as +36 GFP. MTT assays on HeLa cells treated with these synthetic polycations indicated low cytotoxicity at the concentrations used, consistent with that of superpositively charged GFPs (FIG. 25B). None of the four synthetic peptides tested delivered a detectable amount of Cy3-siRNA into HeLa cells as assayed by flow cytometry, even when used at concentrations 10-fold higher than those needed for +36 GFP to effect efficient siRNA delivery or for +15 GFP to effect detectable siRNA delivery (FIG. 20).
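The theoretical net charge and Lys:Arg composition of the synthetic +36 peptide can be checked with a short calculation. The following is only an illustrative sketch: the function names are hypothetical, and it assumes the simplest counting convention (each Lys or Arg counts as +1, each Asp or Glu as −1, with histidine and the chain termini ignored). The convention actually used to assign the nominal charges of the GFP variants is not stated here and may differ.

```python
from collections import Counter

POSITIVE = ("K", "R")  # Lys and Arg, each counted as +1 (assumed convention)
NEGATIVE = ("D", "E")  # Asp and Glu, each counted as -1 (assumed convention)


def theoretical_net_charge(sequence: str) -> int:
    """Return (Lys + Arg) - (Asp + Glu) for a one-letter amino acid sequence.

    Histidine and the N-/C-termini are ignored in this simplified count.
    """
    counts = Counter(sequence.upper())
    positive = sum(counts[residue] for residue in POSITIVE)
    negative = sum(counts[residue] for residue in NEGATIVE)
    return positive - negative


def lys_arg_counts(sequence: str) -> tuple:
    """Return the number of Lys and Arg residues as (n_lys, n_arg)."""
    counts = Counter(sequence.upper())
    return counts["K"], counts["R"]


# Synthetic peptides from the panel described above.
plus36_peptide = "KKR" * 11 + "RRK"  # (KKR)11RRK
arg9 = "R" * 9                       # Arg9

if __name__ == "__main__":
    print(theoretical_net_charge(plus36_peptide))  # 36
    print(lys_arg_counts(plus36_peptide))          # (23, 13)
    print(theoretical_net_charge(arg9))            # 9
```

Under this convention the (KKR)₁₁RRK peptide carries 36 positive residues (23 Lys and 13 Arg) and no acidic residues, matching its nominal +36 charge.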

Coupled with our observation that +15 GFP exhibits low cell penetration and siRNA binding activity in comparison to +25 and +36 GFP (FIGS. 18A and 18B), these results indicate that while GFP must be sufficiently positively charged to acquire the ability to enter cells and transfect siRNA efficiently, positive charge magnitude and charge density are not sufficient to confer transfection activity. Instead, our findings suggest that protein-like features of +36 GFP such as size, globular shape, or stability may be required to achieve the full set of cell penetration and siRNA transfection activities that we observed.

+36 GFP-Mediated Transfection of Plasmid DNA

Similar to the case with siRNA, we observed by gel-shift assay that +36 GFP forms a complex with plasmid DNA (FIG. 26). To test if +36 GFP can deliver plasmid DNA to cells in a manner that supports plasmid-based gene expression, we treated HeLa, IMCD, 3T3-L, PC12, and Jurkat cells with a β-galactosidase expression plasmid premixed with Lipofectamine 2000, +36 GFP, or a C-terminal fusion of +36 GFP and the hemagglutinin 2 (HA2) peptide, which has been reported to enhance endosome degradation (Lundberg et al., 2007, Faseb J., 21:2664-71; incorporated herein by reference). After 24 hours, cells were analyzed for β-galactosidase activity using a fluorogenic substrate-based assay.

Consistent with our previous results (FIGS. 18 and 19), Lipofectamine 2000 treatment resulted in significant β-galactosidase activity in HeLa cells, but only modest β-galactosidase activity in PC12 cells, and no detectable activity in any of the other three cell lines tested (FIG. 21). In contrast, plasmid transfection mediated by 2 μM +36 GFP-HA2 resulted in significant β-galactosidase activity in HeLa, IMCD, and 3T3-L cells, and modest activity in PC12 cells (FIG. 21). Interestingly, treatment with plasmid DNA and 2 μM +36 GFP did not result in detectable β-galactosidase activity (FIG. 21), suggesting that the hemagglutinin-derived peptide enhances DNA transfection or plasmid-based expression efficiency despite its lack of effect on siRNA-mediated gene silencing (FIG. 19C).

These results collectively indicate that +36 GFP-HA2 is able to deliver plasmid DNA into mammalian cells, including several cell lines resistant to cationic lipid-mediated transfection, in a manner that enables plasmid-based gene expression. Higher concentrations of +36 GFP-HA2 are required to mediate plasmid DNA transfection than the amount of +36 GFP or +36 GFP-HA2 needed to induce efficient siRNA transfection.

Conclusion

The present inventors have characterized the cell penetration, siRNA delivery, siRNA-mediated gene silencing, and plasmid DNA transfection properties of three superpositively charged GFP variants with net charges of +15, +25, and +36. The present inventors discovered that +36 GFP is highly cell permeable and capable of efficiently delivering siRNA into a variety of mammalian cell lines, including those resistant to cationic lipid-based transfection, with low cytotoxicity.

Mechanistic studies revealed that +36 GFP enters cells through a clathrin- and caveolin-independent endocytosis pathway that requires sulfated cell-surface proteoglycans and actin polymerization. This delivery pathway differs from previously described strategies for nucleic acid delivery to eukaryotic cells that rely on cell-specific targeting to localize their nucleic acid cargo (Song et al., 2005, Nat. Biotechnol., 23:709-17; Kumar et al., 2007, Nature, 448:39-43; and Cardoso et al., 2007, J. Gene Med., 9:170-83; all of which are incorporated herein by reference). For use in cell culture and even in certain in vivo applications, a general, non-cell type-specific approach to nucleic acid delivery may be desirable.

In four of the five cell lines tested, +36 GFP-mediated siRNA delivery induces significant suppression of gene expression. Moreover, a +36 GFP-hemagglutinin peptide fusion can mediate plasmid DNA transfection in a manner that enables plasmid-based gene expression in the same four cell lines. The presently demonstrated ability to transfect RNA 21 base pairs in length as well as plasmid DNA over 5,000 bp in length suggests that +36 GFP and its derivatives may serve as general nucleic acid delivery vectors.

Many traditional delivery methods rely on the synthesis of covalently linked transfection agent-nucleic acid conjugates such as carbon nanotube-siRNA (Liu et al., 2007, Angew. Chem. Int. Ed. Engl., 46:2023-27; incorporated herein by reference), nanoparticle-siRNA (Rosi et al., 2006, Science, 312:1027-30; incorporated herein by reference), TAT peptide-siRNA (Fisher et al., 2002, J. Biol. Chem., 277:22980-84; incorporated herein by reference), cholesterol-siRNA (Soutschek et al., 2004, Nature, 432:173-78; incorporated herein by reference), and dynamic polyconjugate-siRNA (Rozema et al., 2007, Proc. Natl. Acad. Sci., USA, 104:12982-87; incorporated herein by reference). Use of +36 GFP simply requires mixing the protein and nucleic acid together. Moreover, the reagent described here is purified directly from bacterial cells and used without chemical co-transfectants such as exogenous calcium or chloroquine.

The present inventors previously reported that +36 GFP is thermodynamically almost as stable as sfGFP but, unlike the latter, is able to refold after boiling and cooling (Lawrence et al., 2007, J. Am. Chem. Soc., 129:10110-12; incorporated herein by reference). The present inventors have now demonstrated that +36 GFP exhibits resistance to proteolysis, stability in murine serum, and significant protection of complexed siRNA in murine serum. Thus, the present invention encompasses the recognition that these systems may be useful for in vivo nucleic acid delivery (e.g., to human, mammalian, non-human, or non-mammalian cells).

Thus, the present invention describes for the first time use of protein resurfacing methods for the potent delivery of nucleic acids into mammalian cells. This surprising and significant potency (Deshayes et al., 2007, Meth. Mol. Biol., 386:299-308; and Lundberg et al., 2007, Faseb J., 21:2664-71; both of which are incorporated herein by reference) is complemented by low cytotoxicity, stability in mammalian serum, generality across various mammalian cell types including several that resist traditional transfection methods, the ability to transfect both small RNAs and large DNA plasmids, straightforward preparation from E. coli cells, and simple use by mixing with an unmodified nucleic acid of interest. Thus, the present invention encompasses the recognition that supercharged proteins represent a new class of solutions to general nucleic acid delivery problems in mammalian cells.

Materials and Methods

Cell Culture

HeLa, IMCD, PC12, and 3T3-L cells were cultured in Dulbecco's modification of Eagle's medium (DMEM, purchased from Sigma) with 10% fetal bovine serum (FBS, purchased from Sigma), 2 mM glutamine, 5 I.U. penicillin, and 5 μg/mL streptomycin. Jurkat cells were cultured in RPMI 1640 medium (Sigma) with 10% FBS, 2 mM glutamine, 5 I.U. penicillin, and 5 μg/mL streptomycin. All cells were cultured at 37° C. with 5% CO₂. PC12 cells were purchased from ATCC.

Expression and Purification of Supercharged GFP Proteins

Supercharged GFP variants (protein sequences are listed below) were purified using a variation on our previously reported method. Overexpression plasmids were constructed on a pETDuet-1 backbone. Genes encoding mCherry and Cre were subcloned with a C-terminal His6 tag installed using a PCR primer. Genes encoding Tat, Arg10, penetratin, or +36 GFP and a (GGS)9 linker were inserted N-terminal of mCherry and Cre by USER cloning. The genes encoding the (GGS)9 linker, Tat, Arg10, and penetratin were designed by Gene Designer (DNA 2.0) and ordered as separate complementary DNA strands, phosphorylated using T4 PNK, and hybridized prior to cloning. Sequences encoding N-terminal His6-tagged ubiquitin, and the corresponding G76V mutant, were assembled from a set of overlapping oligonucleotides and ligated into NcoI and NheI restriction enzyme cleavage sites upstream of pET-+36 GFP2 to create a fusion directly to the N-terminus of +36 GFP. Plasmids created in this work will be accessible through Addgene. Briefly, GFP was overexpressed in BL21(DE3) E. coli. Cells were lysed by sonication in 2 M NaCl in PBS, which was found to increase the overall yield of isolated GFP, and the protein was purified as previously described (Lawrence M S, Phillips K J, Liu D R (2007) Supercharging proteins can impart unusual resilience. J Am Chem Soc 129: 10110-10112; incorporated herein by reference). Purified GFPs were quantitated by absorbance at 488 nm assuming an extinction coefficient of 8.33×10⁴ M⁻¹cm⁻¹ (Pedelacq J D, Cabantous S, Tran T, Terwilliger T C, Waldo G S (2006) Engineering and characterization of a superfolder green fluorescent protein. Nat Biotechnol 24: 79-88; incorporated herein by reference). Protein purity was evaluated by SDS-PAGE and Coomassie Blue staining (FIG. 27). Fluorescence emission spectra of the GFP variants used in this work are similar (FIG. 28).
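Quantitation by absorbance follows the Beer-Lambert relationship, c = A / (ε·l). A minimal sketch of that arithmetic is given below. Only the extinction coefficient is taken from the text; the function name, the 1 cm path length, the optional dilution factor, and the example absorbance reading are assumptions made for illustration.

```python
# Extinction coefficient at 488 nm stated in the text, in M^-1 cm^-1.
EXTINCTION_COEFF_488 = 8.33e4


def gfp_concentration_molar(absorbance_488: float,
                            path_length_cm: float = 1.0,
                            dilution_factor: float = 1.0) -> float:
    """Estimate GFP concentration (mol/L) from A488 via Beer-Lambert.

    c = A * dilution / (epsilon * l). The 1 cm default path length and the
    dilution factor are illustrative assumptions, not values from the text.
    """
    if path_length_cm <= 0:
        raise ValueError("path length must be positive")
    return absorbance_488 * dilution_factor / (EXTINCTION_COEFF_488 * path_length_cm)


if __name__ == "__main__":
    # Hypothetical reading: A488 = 0.833 in a 1 cm cuvette, undiluted sample.
    concentration = gfp_concentration_molar(0.833)
    print(f"{concentration * 1e6:.1f} uM")  # 10.0 uM
```

For instance, a hypothetical reading of 0.833 absorbance units in a 1 cm cuvette would correspond to roughly 10 μM protein.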

Protein Sequences of Supercharged GFP Variants

−30 GFP (SEQ ID NO: 97):
MGHHHHHHGGASKGEELFDGVVPILVELDGDVNGHEFSVRGEGEGDATEGELTLKFICTTGELPVPWPTLVTTLTYGVQCFSDYPDHMDQHDFFKSAMPEGYVQERTISFKDDGTYKTRAEVKFEGDTLVNRIELKGIDFKEDGNILGHKLEYNFNSHDVYITADKQENGIKAEFEIRHNVEDGSVQLADHYQQNTPIGDGPVLLPDDHYLSTESALSKDPNEDRDHMVLLEFVTAAGID HGMDELYK

+15 GFP (SEQ ID NO: 98):
MGHHHHHHGGASKGERLFTGVVPILVELDGDVNGHKFSVRGEGEGDATRGKLTLKFICTTGKLPVPWPTLVTTLTYGVQCFSRYPKHMKRHDFFKSAMPEGYVQERTISFKKDGTYKTRAEVKFEGRTLVNRIELKGRDFKEKGNILGHKLEYNFNSHNVYITADKRKNGIKANFKIRHNVKDGSVQLADHYQQNTPIGRGPVLLPRNHYLSTRSALSKDPKEKRDHMVLLEFVTAAGIT HGMDELYK

+25 GFP (SEQ ID NO: 99):
MGHHHHHHGGASKGERLFTGVVPILVELDGDVNGHKFSVRGKGKGDATRGKLTLKFICTTGKLPVPWPTLVTTLTYGVQCFSRYPKHMKRHDFFKSAMPKGYVQERTISFKKDGTYKTRAEVKFEGRTLVNRIKLKGRDFKEKGNILGHKLRYNFNSHNVYITADKRKNGIKANFKIRHNVKDGSVQLADHYQQNTPIGRGPVLLPRNHYLSTRSALSKDPKEKRDHMVLLEFVTAAGIT HGMDELYK

+36 GFP (SEQ ID NO: 100):
MGHHHHHHGGASKGERLFRGKVPILVELKGDVNGHKFSVRGKGKGDATRGKLTLKFICTTGKLPVPWPTLVTTLTYGVQCFSRYPKHMKRHDFFKSAMPKGYVQERTISFKKDGKYKTRAEVKFEGRTLVNRIKLKGRDFKEKGNILGHKLRYNFNSHKVYITADKRKNGIKAKFKIRHNVKDGSVQLADHYQQNTPIGRGPVLLPRNHYLSTRSKLSKDPKEKRDHMVLLEFVTAAGIK HGRDERYK

+36 GFP-HA2 (SEQ ID NO: 101):
MGHHHHHHGGASKGERLFRGKVPILVELKGDVNGHKFSVRGKGKGDATRGKLTLKFICTTGKLPVPWPTLVTTLTYGVQCFSRYPKHMKRHDFFKSAMPKGYVQERTISFKKDGKYKTRAEVKFEGRTLVNRIKLKGRDFKEKGNILGHKLRYNFNSHKVYITADKRKNGIKAKFKIRHNVKDGSVQLADHYQQNTPIGRGPVLLPRNHYLSTRSKLSKDPKEKRDHMVLLEFVTAAGIKHGRDERYKGSAGSAAGSGEFGLFGAIAGFIENGWEGMIDG

Gel-Shift Assay

Gel-shift assays were based on the method of Kumar et al. (Kumar P, Wu H, McBride J L, Jung K E, Kim M H, et al. (2007) Transvascular delivery of small interfering RNA to the central nervous system. Nature 448: 39-43; incorporated herein by reference). siRNA (10 pmol) or plasmid DNA (22 fmol) was mixed with the specified quantity of a GFP variant in phosphate buffered saline (PBS) for 10 minutes at 25° C. The resulting solution was analyzed by non-denaturing electrophoresis using a 15% acrylamide gel for siRNA or a 1% agarose gel for plasmid DNA, stained with ethidium bromide, and visualized with UV light.

Cationic Lipid-Based and GFP-Based Transfection

Transfections using Lipofectamine 2000 (Invitrogen) and Fugene 6 (Roche) were performed following the manufacturer's protocol. Although the molecular weights of these reagents are not provided by the manufacturers, the working concentration of Lipofectamine 2000 during transfection is 2 μg/mL. Based on an assumption that the molecular weight of this cationic lipid is ≤1,000 Da, it was estimated that this concentration corresponds to ≥~2 μM.
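The unit conversion behind this estimate can be sketched as follows. The function name is hypothetical, and the 1,000 Da molecular weight is the stated upper-bound assumption rather than a measured value; because molarity rises as molecular weight falls, any lighter lipid would give a concentration above 2 μM.

```python
def micromolar_from_mass_conc(ug_per_ml: float, mol_weight_da: float) -> float:
    """Convert a mass concentration (ug/mL) to a molar concentration (uM).

    ug/mL is numerically equal to mg/L; dividing mg/L by g/mol gives mmol/L,
    and multiplying by 1000 converts mmol/L to umol/L (uM).
    """
    if mol_weight_da <= 0:
        raise ValueError("molecular weight must be positive")
    return ug_per_ml / mol_weight_da * 1000.0


if __name__ == "__main__":
    # 2 ug/mL working concentration with the assumed upper-bound weight of 1,000 Da.
    print(micromolar_from_mass_conc(2.0, 1000.0))  # 2.0
    # A lighter (hypothetical) 500 Da lipid would correspond to 4 uM.
    print(micromolar_from_mass_conc(2.0, 500.0))   # 4.0
```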

Cells were plated in a 12-well tissue culture plate at a density of 80,000 cells per well. After 12 hours at 37° C., the cells were washed with PBS at 4° C., and for HeLa, IMCD, 3T3-L, and PC12 cells the media were replaced with 500 μL of serum-free DMEM at 4° C.

Jurkat cells were transferred from the culture plate wells into individual 1.5 mL tubes, pelleted by centrifugation, and resuspended in 500 μL of serum-free RPMI 1640 at 4° C.

A solution of GFP and either siRNA or plasmid DNA was mixed in 500 μL of either 4° C. DMEM (for HeLa, IMCD, 3T3-L, and PC12 cells) or 4° C. RPMI 1640 (for Jurkat cells). After 5 min at 25° C., this solution was added to the cells and slightly agitated to mix. After 4 hours at 37° C., the solution was removed from the cells and replaced with 37° C. media containing 10% FBS. GAPDH-targeting Cy3-labeled siRNA and unlabeled siRNA were purchased from Ambion. Plasmid transfections were performed using pSV-β-galactosidase (Promega). β-galactosidase activity was measured using the β-fluor assay kit (Novagen) following the manufacturer's protocol.

Fixed-Cell Imaging

Four hours after treatment with GFP and Cy3-siRNA, cells were trypsinized and replated in medium containing 10% FBS on glass slides coated with Matrigel (BD Biosciences). After 24 hours at 37° C., cells were fixed with 4% formaldehyde in PBS, stained with DAPI where indicated, and imaged with a Leica DMRB inverted microscope equipped with filters for GFP and Cy3 emission. Images were prepared using OpenLab software (Improvision). Exposure times for GFP and Cy3 were fixed at 350 msec and 500 msec, respectively.

Live-Cell Imaging

For experiments using small-molecule inhibitors, cells were plated on a glass-bottomed tissue culture plate (MatTek, 50 mm uncoated plastic dishes with #1.5 glass thickness and a 14 mm glass diameter) and incubated with inhibitor for 1 hour at 37° C., followed by treatment with 50 nM +36 GFP and inhibitor for an additional 1 hour at 37° C. The resulting cells were washed three times with PBS containing the inhibitor and 20 U/mL heparin to remove surface-associated GFP. As an exception, cells treated with 50 nM +36 GFP at 4° C. were washed only once with PBS containing 20 U/mL heparin, which removed GFP bound to the glass slide while still allowing a perimeter of some cell surface-bound GFP to be visible.

Cells were imaged using an inverted microscope (Olympus IX70) in an epi-fluorescent configuration with an oil-immersion objective (numerical aperture 1.45, 60×, Olympus). GFP was excited with the 488 nm line of an argon ion laser (Melles-Griot), and Alexa Fluor 647 was excited with a 633 nm helium-neon laser (Melles-Griot). Long- and short-wavelength emissions were spectrally separated by a 650 nm long-pass dichroic mirror (Chroma) and imaged onto a CCD camera (CoolSnap HQ). A 665 nm long-pass filter was used for Alexa Fluor 647 detection, and a 535/20 nm bandpass filter for GFP. Imaging was conducted at 37° C.

RT-QPCR

Cells were washed with PBS 48, 72, or 96 hours after transfection and total RNA was extracted using the Ribopure kit (Ambion) following the manufacturer's protocol. Samples were treated with 1 μL DNase I (Ambion) and incubated for 30 minutes at 37° C. DNase I was inactivated with DNase I Inactivation Reagent (Ambion) following the manufacturer's protocol. Complementary DNA was generated from 800 ng of RNA using the Retroscript kit (Ambion) following the manufacturer's protocol. QPCR reactions contained 1×IQ SYBR green Master Mix (BioRad), 3 nM ROX reference dye (Stratagene), 2.5 μL of reverse transcription reaction mixture, and 200 nM of both forward and reverse primers:

(SEQ ID NO: 102) Forward GAPDH 5′-CAACTCACTCAAGATTGTCAGCAA-3′
(SEQ ID NO: 103) Reverse GAPDH 5′-GGGATGGACTGTGGTCATGA-3′
(SEQ ID NO: 104) Forward β-actin 5′-ATAGCACAGCCTGGATAGCAACGTAC-3′
(SEQ ID NO: 105) Reverse β-actin 5′-CACCTTCTACAATGAGCTGCGTGTG-3′

QPCR reactions were subjected to the following program on a Stratagene MX3000p QPCR system: 15 minutes at 95° C., then 40 cycles of (30 seconds at 95° C., 1 minute at 55° C., and 30 seconds at 72° C.). Amplification was quantified during the 72° C. step. Dissociation curves were obtained by subjecting samples to 1 minute at 95° C., 30 seconds at 55° C., and 30 seconds at 95° C. and monitoring fluorescence during heating from 55° C. to 95° C. Threshold cycle values were determined using MxPro v3.0 software (Stratagene) and analyzed by the ΔΔCt method.
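For reference, the standard ΔΔCt calculation can be sketched as below. This is a generic illustration and not necessarily the exact analysis performed: the function names and the example threshold cycle values are hypothetical, the choice of reference gene is an assumption (for example, β-actin when GAPDH is the target), and the formula assumes approximately 100% amplification efficiency, i.e., a doubling of product per cycle.

```python
def relative_expression(ct_target_treated: float, ct_ref_treated: float,
                        ct_target_control: float, ct_ref_control: float) -> float:
    """Relative target expression (treated vs. control) by the ddCt method.

    dCt  = Ct(target) - Ct(reference), computed for each sample
    ddCt = dCt(treated) - dCt(control)
    fold = 2 ** (-ddCt), assuming product doubles every cycle.
    """
    delta_ct_treated = ct_target_treated - ct_ref_treated
    delta_ct_control = ct_target_control - ct_ref_control
    delta_delta_ct = delta_ct_treated - delta_ct_control
    return 2.0 ** (-delta_delta_ct)


def percent_knockdown(ct_target_treated: float, ct_ref_treated: float,
                      ct_target_control: float, ct_ref_control: float) -> float:
    """Percent decrease in target mRNA relative to the control sample."""
    fold = relative_expression(ct_target_treated, ct_ref_treated,
                               ct_target_control, ct_ref_control)
    return (1.0 - fold) * 100.0


if __name__ == "__main__":
    # Hypothetical Ct values: the target crosses threshold 3 cycles later
    # after treatment while the reference gene is unchanged.
    print(relative_expression(25.0, 18.0, 22.0, 18.0))  # 0.125
    print(percent_knockdown(25.0, 18.0, 22.0, 18.0))    # 87.5
```

In this hypothetical example, a shift of three cycles in the target gene with no change in the reference gene corresponds to an eight-fold reduction in mRNA, or 87.5% knockdown.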

Western Blotting

Cells were washed once with 4° C. PBS 96 hours after transfection. Cells were lysed with 200 μL RIPA buffer (Boston Bioproducts) containing a protease inhibitor cocktail (Roche) for 5 minutes. The resulting cell lysate was analyzed by SDS-PAGE on a 4-12% acrylamide gel (Invitrogen).

The proteins on the gel were transferred by electroblotting onto a PVDF membrane (Millipore) pre-soaked in methanol. Membranes were blocked in 5% milk for 1 hour, and incubated in primary antibody in 5% milk overnight at 4° C. All antibodies were purchased from Abcam. The membrane was washed three times with PBS and treated with secondary antibody (Alexa Fluor 680 goat anti-rabbit IgG (Invitrogen) or Alexa Fluor 800 rabbit anti-mouse IgG (Rockland)) in blocking buffer (Li-COR Biosciences) for 30 minutes. The membrane was washed three times with 50 mM Tris, pH 7.4 containing 150 mM NaCl and 0.05% Tween-20 and imaged using an Odyssey infrared imaging system (Li-COR Biosciences). Images were analyzed using Odyssey imaging software version 2.0. Representative data are shown in FIG. 29. GAPDH suppression levels shown are normalized to β-tubulin protein levels; 0% suppression is defined as the protein level in cells treated with ~2 μM Lipofectamine 2000 and 50 nM negative control siRNA.
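
The normalization described above can be written out as a short sketch: each GAPDH band intensity is divided by the β-tubulin intensity from the same lane, and the result is expressed relative to the negative control lane, which defines 0% suppression. The function name and the intensity values are hypothetical placeholders, not measured data.

```python
def percent_suppression(target_sample, loading_sample,
                        target_control, loading_control):
    """Percent suppression of a target protein relative to a control lane.

    Each target band is first normalized to its loading-control band
    (here beta-tubulin); the control lane defines 0% suppression.
    """
    normalized_sample = target_sample / loading_sample
    normalized_control = target_control / loading_control
    return 100.0 * (1.0 - normalized_sample / normalized_control)


# Hypothetical integrated band intensities (arbitrary units).
suppression = percent_suppression(300.0, 1000.0, 1200.0, 1000.0)
print(f"GAPDH suppression: {suppression:.0f}%")
```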

Flow Cytometry

Cells were washed three times with 20 U/mL heparin (Sigma) in PBS to remove non-internalized GFP. Adherent cells were trypsinized and resuspended in 1 mL PBS with 1% FBS and 75 U/mL DNase (New England Biolabs). Flow cytometry was performed on a BD LSRII instrument at 25° C. Cells were analyzed in PBS using filters for GFP (FITC) and Cy3 emission. At least 10⁴ cells were analyzed for each sample.

Synthetic Cationic Peptides

(Arg)₉ and (KKR)₁₁(RRK) (SEQ ID NO: 156) were purchased from Chi Scientific and used at a purity of ≥95%. Poly-(L)-Lys and poly-(D)-Lys were purchased from Sigma. Poly-(L)-Lys is a mixture with a molecular weight window of 1,000-5,000 Da, and a median molecular weight of 3,000 Da. Poly-(D)-Lys is a mixture with a molecular weight window of 1,000-5,000 Da, and a median molecular weight of 2,500 Da. Stock solutions of all synthetic peptides were prepared at a concentration of 20 μM in PBS.

+36 GFP-siRNA Particle Size Characterization

Dynamic light scattering was performed using a Protein Solution DynaPro instrument at 25° C. using 20 μM +36 GFP and 5 μM siRNA in PBS. A purified 20-bp RNA duplex (5′ GCAUGCCAUUACCUGGCCAU 3′, from IDT; SEQ ID NO: 106) was used in these experiments. Data were modeled to fit an isotropic sphere. 5 μL of solution analyzed by DLS (20 μM +36 GFP and 5 μM siRNA in PBS) was imaged using a Leica DMRB inverted microscope.

Deubiquitination Assay Western Blot

Cells were plated in a 48-well plate at a density of 1×10⁵ cells per well. After 18 h, cells were washed with cold PBS and incubated with 100 nM ubiquitin-+36 GFP or 100 nM mutant G76V ubiquitin-+36 GFP in serum-free media for 1 h. Cells were washed three times with 20 U/mL heparin in PBS, and lysed directly in LDS sample buffer and sonicated.

Crude HeLa cytosolic extract was prepared by harvesting 5×10⁶ HeLa cells using a plate scraper into ice-cold PBS. Cells were pelleted at 200 × g for 5 min and resuspended in 1 mL of 50 mM Tris-HCl pH 7.5, 150 mM NaCl, 2 mM EDTA, 2 mM DTT, 1.7 μg/mL aprotinin, 10 μg/mL leupeptin, 10 mM PMSF, and 0.5% NP-40. Homogenized cells were incubated on ice for 10 min before centrifugation at 13,000 × g for 15 min to remove nuclei and cell debris. Either wt or mutant ubiquitin-+36 GFP (5 pmol) was added to the lysate. The mixture was incubated with or without either 10 mM N-ethylmaleimide or 20 μg/mL ubiquitin-aldehyde for 1 hour at 37° C.

Samples were analyzed on a 12% SDS-PAGE gel and transferred by electroblot onto a PVDF membrane (Millipore) pre-soaked in methanol. Membranes were blocked in 5% milk for 1 h and incubated in primary antibody in 3% BSA for 30 min at room temperature. Anti-GFP antibody (1/10,000 dilution, ab290) was purchased from Abcam. The membrane was washed three times with PBS and treated with the secondary antibody IRDye 800CW Goat Anti-Mouse IgG (1/10,000 dilution, Li-COR Biosciences) in blocking buffer (Li-COR Biosciences) for 30 min. The membrane was washed three times with 50 mM Tris, pH 7.4, containing 150 mM NaCl and 0.05% Tween-20 and visualized using an Odyssey infrared imaging system (Li-COR Biosciences). Images were analyzed using Odyssey imaging software version 2.0.

Cre Reporter Cell Lines

HeLa cells were plated at 3×10⁴ cells/well in 48-well plates. After 16 hours, cells were transfected with pCALNL-DsRed2 using Effectene transfection reagent (Qiagen). After incubation with 100-1000 nM of each Cre fusion protein for 4 hours in serum-free DMEM, cells were washed three times with 20 U/mL heparin in PBS and incubated in full media for 48 hours. Delivery of Cre was assayed by following DsRed2 expression using flow cytometry and fluorescence microscopy.

Cre reporter 3T3 cells were plated at 1×10⁵ cells/well in 48-well plates. After 16 hours, cells were incubated with various concentrations of protein for 4 hours in serum-free media. Cells were washed three times with 20 U/mL heparin in PBS and incubated in full media for 48 hours. Recombined cells were quantified by X-gal staining and manual counting.

BSR cells were obtained from Matthias Schnell (Thomas Jefferson University). A pQCXIX MMLV retrovirus (Clontech) containing the tdTomato Cre reporter construct was generated by subcloning the tdTomato gene (Clontech) into a pCALNL backbone and packaged using Plat-E cells. BSR cells were infected with retrovirus and integrants were selected for one week in the presence of 1 mg/mL G418 (Sigma). BSR.LNL.tdTomato cells were plated at 1×10⁵ cells/well in 48-well plates. After 16 hours, cells were incubated with various concentrations of protein for 4 hours in serum-free media. Cells were washed three times with 20 U/mL heparin in PBS and incubated in full media for 48 hours. For chloroquine treatment of BSR cells, cells were incubated with Cre fusion proteins for 4 hours in serum-free media containing 100 μM chloroquine, washed three times with 20 U/mL heparin in PBS, and incubated 12 hours in full media containing 100 μM chloroquine. Following this incubation, cells were washed once with PBS and incubated a further 36 hours in full media without chloroquine. Delivery of Cre was assayed by following tdTomato expression using flow cytometry and fluorescence microscopy.

For fluorescent Cre reporters, recombinants were identified by flow cytometry as those cells of the live-cell population that exhibited significantly higher fluorescence than that of non-treated reporter cells. Typically, the recombined population exhibited fluorescence at least 10-fold higher than the non-recombined cells and was readily detected as a distinct subpopulation. Fluorescence gates were drawn accordingly to quantitate recombined and non-recombined cells.
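
As an illustration only, the gating logic described above can be approximated numerically. The sketch assumes a fixed gate placed at 10-fold the median fluorescence of non-treated reporter cells; this threshold rule, the function name, and the example intensities are assumptions made for illustration, since the gates described herein were drawn on the observed populations.

```python
from statistics import median


def fraction_recombined(treated, untreated, fold=10.0):
    """Fraction of treated live cells falling above a fluorescence gate.

    The gate is set (by assumption) at `fold` times the median
    fluorescence of the non-treated reporter cells.
    """
    gate = fold * median(untreated)
    recombined = sum(1 for intensity in treated if intensity >= gate)
    return recombined / len(treated)


# Hypothetical per-cell fluorescence intensities (arbitrary units).
untreated_cells = [90.0, 100.0, 110.0]
treated_cells = [95.0, 105.0, 1500.0, 2000.0]
print(fraction_recombined(treated_cells, untreated_cells))
```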

In Vivo Retinal Injections

Adult CD1 mice were subretinally injected with 0.5 μL of 100 μM +36 GFP. After 6 hours, the retinas were harvested and analyzed by fluorescence microscopy. P0 pups were subretinally injected with 0.5 μL of 40 μM wtCre, Tat-Cre, or +36 GFP-Cre. After 72 hours, retinae were harvested and fixed with 0.5% glutaraldehyde. Fixed retinae were stained with X-gal overnight and embedded in 50% OCT, 50% of 30% sucrose and stored at −80° C. Retinae were cut into 30 μm sections and imaged for X-gal staining on a Zeiss Axiophot brightfield microscope with a Nikon CXM-1200F camera. Delivery of Cre was assayed by manually counting LacZ+ cells.

Stability Assays

To assess siRNA stability in murine serum, siRNA (10 pmol) was mixed with sfGFP (40 pmol), mixed with +36 GFP (40 pmol), or incubated alone in PBS for 10 minutes at 25° C. The resulting solution was added to four volumes of mouse serum (20 μL total) and incubated at 37° C. for the indicated times. 15 μL of the resulting solution was diluted in water to a total volume of 100 μL. 100 μL of TRI reagent (Ambion) and 30 μL of chloroform were added. After vigorous mixing and centrifugation at 1,000 × g for 15 minutes, the aqueous layer was recovered. siRNA was precipitated by the addition of 15 μL of 3 M sodium acetate, pH 5.5, and two volumes of 95% ethanol. siRNA was resuspended in 10 mM Tris pH 7.5 and analyzed by gel electrophoresis on a 15% acrylamide gel. Serum stability of +36 GFP when complexed with siRNA was simultaneously measured by anti-GFP Western blot with 5 μL of the incubation.

To assess the stability of plasmid DNA complexed with +36 GFP in murine serum, plasmid DNA (0.0257 pmol) was mixed with either 2.57 pmol (100 eq.) or 12.84 pmol (500 eq.) of either sfGFP or +36 GFP in 4 μL of PBS for 10 minutes. To this solution was added 16 μL of mouse serum (20 μL total), and the mixture was incubated at 37° C. for the indicated times. DNA was isolated by phenol chloroform extraction and analyzed by gel electrophoresis on a 1% agarose gel, stained with ethidium bromide, and visualized with UV light.
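
The molar equivalents and final concentrations implied by the quantities above can be checked with a few lines of arithmetic. This is a minimal sketch; the helper names are hypothetical, and the only inputs are the amounts and volumes stated in the text.

```python
def molar_equivalents(protein_pmol, dna_pmol):
    """Moles of protein per mole of plasmid DNA."""
    return protein_pmol / dna_pmol


def final_concentration_nm(amount_pmol, volume_ul):
    """Concentration in nM; 1 pmol/uL equals 1 uM, i.e. 1000 nM."""
    return amount_pmol / volume_ul * 1000.0


dna_pmol = 0.0257
total_volume_ul = 4.0 + 16.0  # PBS complex plus mouse serum

for protein_pmol in (2.57, 12.84):
    eq = molar_equivalents(protein_pmol, dna_pmol)
    conc = final_concentration_nm(protein_pmol, total_volume_ul)
    print(f"{protein_pmol} pmol protein: ~{eq:.0f} eq., {conc:.1f} nM final")

print(f"serum fraction: {16.0 / total_volume_ul:.0%}")
```

Note that 12.84 pmol corresponds to approximately 500 equivalents after rounding.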

To assess the stability of proteins in murine serum, 100 pmol of each protein in 2 μL of PBS was mixed with 8 μL of murine serum (Sigma) and incubated at 37° C. The samples were mixed with SDS protein loading buffer and heated to 90° C. for 10 minutes. The resulting mixture was analyzed by SDS-PAGE on a 4-12% acrylamide gel (Invitrogen) and imaged by Western blot.

To assess stability in the presence of proteinase K, 100 pmol of +36 GFP or BSA was treated with 0.6 units of proteinase K (New England Biolabs) at 37° C. The samples were mixed with SDS protein loading buffer, heated to 90° C. for 10 minutes, and analyzed by SDS-PAGE on a 4-12% acrylamide gel (Invitrogen).

Example 4 Supercharged Proteins are Effective Protein Delivery Reagents

mCherry, a fluorescent protein, was fused to each of +36 GFP (via a cleavable linker having amino acid sequence ALAL, SEQ ID NO: 107), TAT, and Arg₉ to generate three mCherry fusion proteins. These fusions were tested for their ability to deliver mCherry to HeLa, IMCD, and PC12 cells.

In order to assess how well +36 GFP delivers proteins to cells, HeLa, PC12, and 3T3-L cells were treated with either (1) mCherry-TAT, (2) mCherry-R₉, or (3) mCherry-+36 GFP. Cells were treated with 50 nM, 500 nM, 1 μM, or 2 μM material for 4 hours in DMEM, followed by heparin wash and FACS.

mCherry-ALAL-+36 GFP penetrated cells much more potently than mCherry-TAT or mCherry-Arg₉ (FIG. 33). FIG. 34 shows internalization of these three fusions via fluorescence microscopy. Data show that +36 GFP is a highly potent and general protein delivery reagent (FIG. 34).

Example 5 Mining Genomes for Natural Supercharged Proteins

The present invention encompasses the recognition that genomes (e.g., the human genome) can be mined to identify natural supercharged proteins that might be useful for delivery of agents (e.g., nucleic acids, proteins, etc.). Ten human proteins were expressed and purified (i.e., C-Jun (Protein Accession No.: P05412); TERF 1 (P54274); Defensin 3 (P81534); Eotaxin (Q9Y258); N-DEK (P35659); PIAS 1 (O75925); Ku70 (P12956); Midkine (P21741); HBEGF (Q99075); HGF (P14210); SFRS12-IP1 (Q8N9Q2); Cyclon (Q9H6F5)), and four of these (i.e., HBEGF, N-DEK, C-Jun, and 2HGF) displayed the ability to bind to siRNA and deliver siRNA to cells (i.e., cultured HeLa cells).

Human proteins were assayed for binding to siRNA by gel shift assay. Gel-shift assays were based on the method of Kumar et al. (Kumar P, Wu H, McBride J L, Jung K E, Kim M H, et al. (2007) Transvascular delivery of small interfering RNA to the central nervous system. Nature 448: 39-43; incorporated herein by reference). Ambion negative control siRNA (~150 ng) was mixed with the specified quantity of human protein in phosphate buffered saline (PBS) for 10 minutes at 25° C. The resulting solution was analyzed for unbound siRNA by non-denaturing electrophoresis using a 15% acrylamide gel for siRNA, stained with ethidium bromide, and visualized with UV light (FIG. 35A).

Human proteins were assayed for delivery of siRNA to HeLa cells. Cells were plated in a 12-well tissue culture plate at a density of 80,000 cells per well. After 12 hours at 37° C., the cells were washed with 4° C. PBS, and the medium was replaced with 500 μL of serum-free DMEM at 4° C. A solution of human protein and Ambion negative control Cy3-labeled siRNA was mixed in 500 μL of 4° C. DMEM. After 5 min at 25° C., this solution was added to the cells and slightly agitated to mix. Final concentration of human proteins was 1 micromolar and siRNA was 50 micromolar. After 4 hours at 37° C., the solution was removed from the cells and replaced with 37° C. media containing 10% FBS. Cells were then analyzed for siRNA delivery by fixed cell imaging and flow cytometry. Internalization of protein-siRNA complexes is shown in FIG. 35B.

HeLa cells were transfected with Ambion Cy3-labeled siRNA using human proteins, incubated for three days, and then assayed for degradation of a targeted mRNA (FIG. 35C). Targeted GAPDH mRNA levels were compared to β-actin mRNA levels. “Control” indicates use of a non-targeting siRNA. Lipofectamine 2000 was used as a positive control.

Delivery of siRNA and Functional Protein into Cells by Supercharged Human Proteins

To test whether naturally occurring supercharged human proteins, which may offer less immunogenic alternatives to +36 GFP, can be employed to deliver a nucleic acid or a functional protein to cells, we investigated the delivery characteristics of seven naturally occurring supercharged proteins: HRX, c-Jun, Eotaxin, defensin3, HBEGF, N-DEK, and HGF (see FIG. 48 for net charge per kDa and distribution of Arg/Lys residues). HeLa cells were assayed for Cy3 fluorescence by flow cytometry and fluorescence microscopy after treatment for 4 h with 1 μg Cy3-labelled siRNA + 5 μg of the respective protein (FIG. 49). Efficient delivery of siRNA was observed with some of the tested proteins, for example, with HBEGF, defensin-3, and N-DEK. Similarly, Cre recombinase was efficiently delivered to BSR cells by various naturally occurring supercharged proteins, for example, c-Jun, N-DEK, and Eotaxin (FIG. 50). The previously uncharacterized ability of superpositive human proteins to penetrate cells suggests potential new biological roles.
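
A rough estimate of the net charge per kDa referred to above can be sketched from a primary sequence. The sketch below counts Lys and Arg as +1 and Asp and Glu as −1, ignores His and the termini, and approximates molecular weight as 110 Da per residue; these simplifications, the function name, and the example sequence are assumptions for illustration and are not the method used to prepare FIG. 48.

```python
def net_charge_per_kda(sequence, mean_residue_mass_da=110.0):
    """Return (theoretical net charge, net charge per kDa) for a protein.

    Counts Lys/Arg as +1 and Asp/Glu as -1 (approximately neutral pH);
    mass is approximated as residue count times an average residue mass.
    """
    seq = sequence.upper()
    positive = seq.count("K") + seq.count("R")
    negative = seq.count("D") + seq.count("E")
    net_charge = positive - negative
    mass_kda = len(seq) * mean_residue_mass_da / 1000.0
    return net_charge, net_charge / mass_kda


# Hypothetical 10-residue peptide, not a sequence from this disclosure.
charge, charge_per_kda = net_charge_per_kda("KKRRKDEGAS")
print(charge, round(charge_per_kda, 2))
```

Applied across a proteome, a ranking by this ratio would be one simple way to shortlist candidate naturally supercharged proteins for experimental testing.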

Example 6 Pyrene Butyric Acid Improves Consistency of Gene Silencing

The present inventors have discovered that pyrene butyrate, an endosomolytic agent (Futaki et al., 2006, ACS Chem. Biol., 1:299; incorporated herein by reference), can increase gene silencing effects and decrease batch-to-batch variability. Without wishing to be bound by any one particular theory, such variability may be caused by variable endosome escape efficiency. Thus, the present inventors have developed a method for improving the efficiency, consistency, and reproducibility of gene silencing.

The protocol below utilizes +36 GFP and pyrene butyric acid (PBA), but can readily be generalized to any supercharged protein and any endosomolytic agent (e.g., chloroquine, HA2, melittin).

HeLa cells were grown to ~80% confluency in a 12-well plate. DMEM/10% FBS was removed and the cells were washed 3 times with PBS. To each well was added 1 mL of a solution containing 50 μM PBA in PBS. Cells were incubated in this solution for 5 minutes at 37° C. In a small plastic tube, 200 fmol of GAPDH-suppressing siRNA (2 μL of a 100 μM siRNA solution) and 800 fmol +36 GFP were pre-mixed and allowed to incubate for 5 minutes at 25° C. One quarter (¼) of the total volume of the siRNA/+36 GFP complex was added to each well containing 1 mL 50 μM PBA in PBS. The tissue culture tray was agitated slightly to homogenize the solution in each well, resulting in a solution containing 50 μM siRNA and 200 μM +36 GFP. Cells were incubated under these conditions for 3 hours at 37° C. The 50 μM PBA/PBS solution was removed and cells were washed three times with PBS, followed by the addition of 1 mL DMEM with 10% FBS. Cells were incubated under these conditions for 4 days, and knockdown of GAPDH expression was quantitated by Western blot.

About 20% cytotoxicity was observed after a 3-hour incubation in 50 μM PBA/PBS. Much higher cytotoxicity (~80%) was observed when HeLa cells were incubated in 50 μM PBA/PBS for ≥4 hours. Cytotoxicity of PBA may vary by cell type.

Example 7 Potent Delivery of Functional Proteins into Mammalian Cells by Supercharged Green Fluorescent Protein

“Supercharged” GFP variants, which have been extensively mutated at their surface-exposed residues to impart extremely high theoretical net charge magnitudes ranging from −30 to +48, were previously described (Lawrence et al., JACS 129, 10110-10112, 2007; McNaughton et al., Proc. Natl. Acad. Sci. U.S.A. 106, 6111-6116, 2009). Such variants can enter a variety of mammalian cells by binding to anionic cell-surface proteoglycans and undergoing endocytosis in an energy-dependent and clathrin-independent fashion, and can deliver siRNA and plasmid DNA into a variety of mammalian cell lines without detectable cytotoxicity.

This Example describes that +36 GFP can be fused to a variety of proteins while maintaining its ability to rapidly and potently penetrate a variety of mammalian cell types. When delivered as fusions with +36 GFP, mCherry, ubiquitin, and Cre recombinase all retain their native functions, suggesting that fusions with +36 GFP can escape endosomes and travel in functional form to the cytosol or the nucleus. When +36 GFP is fused to a protein of interest through a protease-sensitive linker, the internalized protein of interest is readily cleaved from the +36 GFP moiety both in vitro and in cells. Side-by-side comparisons of +36 GFP, Tat, and Arg₉ fused to mCherry or Cre recombinase revealed that fusions with +36 GFP result in significantly higher levels of internalized protein (~10- to 100-fold) and in greater efficiencies of Cre-induced recombination (~2- to 15-fold). Collectively, these results suggest that superpositively charged proteins may serve as a promising new tool for the delivery of proteins into mammalian cells.

Mammalian Cell Penetration of +36 GFP and Tat Protein Fusions

The ability of +36 GFP to enter cells when genetically fused with a variety of other proteins was tested. Fusions of +36 GFP to mCherry (Shaner et al., Nat. Biotechnol. 22, 1567-1572, 2004), Cre recombinase (Abremski et al., Cell 32, 1301-1311, 1983), and ubiquitin (Schlesinger et al., Nature 255, 423-424, 1975) in various orientations and with different linkers were generated. For mCherry and Cre, optimal yield of purified, full-length protein was obtained from +36 GFP-(GGS)₄-ALAL-(GGS)₄-(protein of interest)-His₆. Full-length ubiquitin-+36 GFP was optimally expressed and purified with an amino-terminal His₆ tag. Fusion architectures are shown schematically in FIG. 36 a and SDS-PAGE analyses of purified proteins are shown in FIG. 39. The fluorescent properties of +36 GFP were not significantly altered by fusion with Cre or ubiquitin (FIG. 40). The +36 GFP-mCherry fusion protein exhibited a lower fluorescence emission at 515 nm and an additional weaker fluorescence emission peak at 620 nm when excited at 488 nm, consistent with Förster resonance energy transfer (FRET) from the +36 GFP fluorophore to the tethered mCherry fluorophore. These effects were diminished upon proteolytic separation of mCherry and +36 GFP (FIG. 40 d, e, f).

To test whether the ability of +36 GFP to penetrate mammalian cells was retained in these fusions, HeLa cells were incubated with various concentrations of the three protein fusions in serum-free DMEM for 4 hours at 37° C. After incubation, cells were washed three times with heparin under conditions that have been shown to remove protein bound to the cell membrane (McNaughton et al., Proc. Natl. Acad. Sci. U.S.A. 106, 6111-6116, 2009; Veldhoen et al., Nucleic Acids Res. 34, 6561-6573, 2006). The degree of internalized GFP was measured by flow cytometry. For comparison, +36 GFP, non-supercharged starting GFP (stGFP (Lawrence et al., JACS 129, 10110-10112, 2007), a single-mutant variant of superfolder GFP (Pedelacq et al., Nat. Biotechnol. 24, 79-88, 2006)), and Tat-fused stGFP were also incubated with HeLa cells under the same conditions. At concentrations up to 100 nM, both stGFP and Tat-stGFP exhibited little or no detectable cell penetration. Tat-stGFP penetrated HeLa cells modestly at 300 nM and significantly at 1 μM, while, as expected, stGFP did not penetrate cells to a detectable extent even at 1 μM (FIG. 36 b). In contrast, +36 GFP alone, +36 GFP-mCherry, +36 GFP-Cre, and ubiquitin-+36 GFP all penetrated cells potently at low nanomolar concentrations. Treatment of HeLa cells with 10 nM +36 GFP alone, +36 GFP-mCherry, +36 GFP-Cre, or ubiquitin-+36 GFP resulted in cellular levels of GFP comparable to treatment with 1 μM Tat-stGFP. Treatment of HeLa cells with 1 μM +36 GFP, +36 GFP-mCherry, +36 GFP-Cre, or ubiquitin-+36 GFP resulted in cellular levels of GFP ~50- to 100-fold greater than in HeLa cells treated with 1 μM Tat-stGFP (FIG. 36 b).

Similarly, when the concentration of protein was fixed at 100 nM and the amount of internalized GFP was measured over time by flow cytometry, +36 GFP and its protein fusions exhibited significant levels of internalized protein within 15 minutes (the earliest measurement). In contrast, significant levels of internalized protein were not observed by flow cytometry in HeLa cells treated with 100 nM Tat-stGFP until ~8 hours (FIG. 36 c). The +36 GFP fusions penetrated cells after 2 hours of incubation at 100 nM to an extent ~20- to 100-fold greater than that of Tat-stGFP after 8 hours of incubation (FIG. 36 c). These results were supported by fluorescence microscopy of HeLa cells fixed after a 30-minute incubation with +36 GFP protein fusions at 100 nM (FIG. 36 d).

HeLa cells treated for 4 h with 2 nM +36 GFP-mCherry, +36 GFP-Cre, and ubiquitin-+36 GFP did not exhibit significant cytotoxicity by MTT assay (FIG. 41). Taken together, these results indicate that all three proteins tested as fusions with +36 GFP retain the dose-dependent ability of +36 GFP to penetrate HeLa cells potently (at nanomolar concentrations), quickly (in minutes), and without apparent cytotoxicity. In addition, +36 GFP and all +36 GFP protein fusions tested exhibited significantly greater potency and speed of cell penetration than that of Tat-stGFP, which behaved in a manner consistent with previous descriptions of Tat-GFP fusions (Ryu et al., Mol. Cells. 16, 385-391, 2003).

Comparison of mCherry Delivery by +36 GFP, Tat, and Arg₉

+36 GFP-, Tat-, and Arg₉-mediated protein delivery were compared in a variety of mammalian cells. HeLa cells, inner medullary collecting duct (IMCD) cells, and rat pheochromocytoma PC12 cells were incubated with various concentrations of +36 GFP-mCherry, Tat-mCherry, or Arg₉-mCherry fusion proteins for 4 hours at 37° C. After incubation, cells were washed with heparin to remove membrane-bound proteins and assayed for internalized mCherry by flow cytometry (FIG. 33). Depending on the cell line, +36 GFP delivered ~10- to 100-fold more mCherry than either Tat or Arg₉. Effective delivery of mCherry by 100 nM +36 GFP across cell lines was further confirmed by fluorescence microscopy (FIG. 34). In all cell lines, +36 GFP and mCherry were observed primarily as distinct green and red puncta (presumably endosomes based on our previous studies (McNaughton et al., Proc. Natl. Acad. Sci. U.S.A. 106, 6111-6116, 2009)) dispersed throughout the cellular cytoplasm. The lack of colocalization of the GFP and mCherry chromophores suggests significant proteolytic separation of the two proteins and reshuffling of the endosomal vacuoles and their cargo. These results suggest that +36 GFP may be a significantly more potent protein transduction domain than the widely used Tat and Arg₉ peptides.

Comparison of mCherry Delivery by +36 GFP, Tat, Arg₁₀, and Penetratin at Different Concentrations

+36 GFP-, Tat-, Arg₁₀-, and penetratin-mediated delivery of mCherry were compared in a variety of mammalian cells using a variety of protein fusion concentrations (FIG. 44). In HeLa cells, BSR cells (a baby hamster kidney (BHK) cell derivative), inner medullary collecting duct (IMCD) cells, 3T3 murine fibroblast cells, and rat pheochromocytoma PC12 cells, increased median mCherry fluorescence intensity was observed with +36 GFP as compared to the other delivery vehicles at all concentrations. The cell-penetration potency of +36 GFP exceeded that of known cell-penetrating peptides and proteins, especially at low concentrations.

Ability of +36 GFP Protein Fusions to Access the Cytosol

To probe localization of mCherry after delivery as a fusion with +36 GFP, HeLa cells were incubated with 100 nM +36 GFP-mCherry for 4 h, and the resulting cells were plated together with untreated HeLa cells. As imaged by live-cell widefield fluorescence microscopy, a diffuse red fluorescence was observed from the cytoplasm of cells containing GFP puncta, while cells lacking GFP puncta did not contain a diffuse red signal (FIG. 42). This observation suggests that some of the internalized mCherry protein was separated from GFP, escaped from endosomes, and was distributed throughout the cytosol. In contrast, the absence of a significant green fluorescent signal outside puncta suggests that +36 GFP may remain associated with endosomes or with anionic components of former endosomes.

A deubiquitination assay was used to more rigorously evaluate the ability of +36 GFP fusions to escape endosomes and access the cytosol. Deubiquitinase (DUB)-dependent removal of a ubiquitin moiety from a translationally fused protein domain has been previously used as an indicator of exposure to the cytosolic environment in mammalian cells (Loison et al., Mol. Ther. 11, 205-214, 2005). A ubiquitin-+36 GFP fusion in which the C-terminus of ubiquitin was directly followed by the N-terminus of +36 GFP was expressed and purified. A direct fusion of this type is known to be recognized and processed by endogenous DUBs, resulting in a ~9 kDa reduction in molecular weight. A mutant form of ubiquitin (G76V) that is not a substrate for DUBs (Loison et al., Mol. Ther. 11, 205-214, 2005) was similarly fused to +36 GFP to distinguish the effect of cytosolic DUBs from non-specific proteolysis.

HeLa cells were incubated with either 200 nM ubiquitin-+36 GFP or 200 nM ubiquitin G76V-+36 GFP, washed with heparin to remove surface-bound proteins, and analyzed by western blot. After a 1-hour incubation, 22% of ubiquitin-+36 GFP was deubiquitinated, producing a protein equal in size to +36 GFP (FIG. 37 a). In contrast, the G76V mutant-+36 GFP fusion was not cleaved, indicating that this reduction in size does not arise from non-specific endosomal proteases but instead from the action of cytosolic DUBs. In the presence of chloroquine, a small molecule known to disrupt endosomal acidification and enhance release of endocytosed molecules (Wadia et al., Nat. Med. 10, 310-315, 2004), cleavage of ubiquitin-+36 GFP increased from 22% to 36% (FIG. 37 b). The enhanced cleavage upon addition of chloroquine further supports the assumption that cleavage of the ubiquitin-+36 GFP fusion protein reflects its cytosolic exposure. Additionally, ubiquitin-+36 GFP spiked into the cell lysis buffer prior to harvesting untreated cells was not cleaved (FIG. 37 a), indicating that the observed deubiquitination is a result of exposure to cytosolic DUBs resulting from cell penetration of +36 GFP, and not due to contact with DUBs during the cell harvesting procedure.

Finally, an in vitro deubiquitination assay was performed using HeLa cytosolic extract. Incubation of ubiquitin-+36 GFP, but not G76V mutant ubiquitin-+36 GFP, in the cytosolic extract resulted in deubiquitination; in contrast, incubation of either protein in cytosolic extract in the presence of the DUB inhibitor N-ethylmaleimide (Borodovsky et al., EMBO J. 20, 5187-5196, 2001) did not result in cleavage, further suggesting that the cleavage of ubiquitin-+36 GFP is a result of DUB activity (FIG. 37 c). These results demonstrate that a significant fraction of the ubiquitin-+36 GFP protein fusion can rapidly enter cells and access the cytosol, rather than remaining entirely localized within endosomes.

Comparison of Active Cre Recombinase Delivery by +36 GFP, Tat, and Arg₉

To further explore the ability of +36 GFP to deliver fused proteins in functional form to specific subcellular locations, the ability of +36 GFP, Tat, and Arg₉ to deliver Cre recombinase into a variety of mammalian cells was compared. Cre has been used as a reporter of functional protein delivery in a wide range of cell lines (Wadia et al., Nat. Med. 10, 310-315, 2004). In mammalian cells, Cre must localize to the nucleus and eventually tetramerize in order to mediate DNA recombination (Guo et al., Nature 389, 40-46, 1997). To assess the possibility that localization or oligomerization of Cre is impeded by an attached protein transduction domain, susceptibility of the linker bridging +36 GFP and Cre to cleavage by endogenous proteases was determined. Cathepsin B is a ubiquitous mammalian endosomal protease that exhibits broad substrate specificity and efficiently cleaves the peptide Ala-Leu-Ala-Leu (Trouet et al., Proc. Natl. Acad. Sci. U.S.A. 79, 626-629, 1982). This motif was included in the linker joining +36 GFP and Cre (FIG. 36 a). When incubated with purified cathepsin B in vitro, the +36 GFP-Cre fusion was indeed cleaved into two protein fragments with lengths consistent with separated +36 GFP and Cre (FIG. 38 a).

To evaluate the ability of the +36 GFP-Cre fusion to catalyze DNA recombination before and after proteolytic cleavage of the linker, an in vitro recombination assay was performed. Incubation of pCALNL (Matsuda et al., Proc. Natl. Acad. Sci. U.S.A. 104, 1027-1032, 2007), a 6.8 kb circular plasmid containing a 1.2 kb region flanked by loxP sites, with Cre recombinase in vitro leads to excision of the 1.2 kb region (FIG. 38 a). Excision of the 1.2 kb region was not observed when pCALNL was incubated with the intact +36 GFP-Cre fusion. After the fusion protein was incubated with cathepsin B, recombinase activity was restored (FIG. 38 b). These results indicate that cleavage of the +36 GFP-Cre linker is required for efficient Cre recombinase activity. When fused to Tat and to Arg₉, Cre was found to retain recombinase activity (FIG. 43).

The abilities of +36 GFP, Tat, and Arg₉ to deliver functional (nuclearly localized and active) Cre to HeLa, NIH-3T3, and murine embryonic stem cells were compared. Following transfection with pCALNL, which also serves as a DsRed2-based Cre activity reporter plasmid (Matsuda et al., Proc. Natl. Acad. Sci. U.S.A. 104, 1027-1032, 2007), HeLa cells were incubated with 100 nM, 500 nM, or 1 μM of each fusion protein for 1 hour in serum-free media. After incubation, cells were washed with heparin and incubated in full media for 24 hours. Cre recombinase activity was assayed by expression of DsRed2 via flow cytometry and fluorescence microscopy (FIG. 38 c). At all protein concentrations tested, +36 GFP-Cre was ~3- to 7-fold more effective at producing recombinants than the corresponding fusions with Tat or Arg₉.

The delivery of active Cre was further evaluated in a NIH-3T3 cell line harboring an integrated lacZ-based Cre-reporter (Wadia et al., Nat. Med. 10, 310-315, 2004). 3T3 cells were incubated with either 100 nM or 1 μM of each fusion protein for 18 hours and then stained with X-Gal to identify recombinants. Consistent with the HeLa cell results, +36 GFP-Cre resulted in 2- to 10-fold more efficient generation of recombinants than either Tat or Arg₉ (FIG. 38 d).

Finally, +36 GFP-Cre, Tat-Cre, and Arg₉-Cre were also compared for their ability to enter and catalyze recombination in a murine embryonic stem (mES) cell line containing an integrated mCherry-based Cre activity reporter. Colonies of mES cells were treated with 1 μM of each fusion protein in serum-free media for 4 hours, washed three times with heparin, and incubated for 18 hours. Recombination and expression of mCherry were assayed by flow cytometry and fluorescence microscopy (FIG. 38 e). Similar to the results in HeLa and NIH-3T3 cells, +36 GFP-Cre generated 2- to 5-fold more recombinants than Tat-Cre or Arg₉-Cre (FIG. 38 e). Fluorescence microscopy confirmed that +36 GFP was able to produce recombinants in intact mES colonies, with multiple recombinant cells within a given colony. Furthermore, treated mES cells were able to form recombinant colonies when harvested and replated (FIG. 38 e). These data suggest that +36 GFP is a highly potent protein transduction domain, capable of mediating significantly higher levels of functional Cre recombinase delivery compared to the widely used Tat and Arg₉ peptides.

Comparison of Active Cre Recombinase Delivery by +36 GFP, Tat, and Arg₉

HeLa cells, 3T3 cells, and BSR cells harboring a loxP reporter construct were treated with +36 GFP-Cre, Arg₁₀-Cre, penetratin-Cre, and Tat-Cre at different concentrations (FIG. 45). In HeLa and 3T3 cells, Cre delivery was more effective with +36 GFP than with other cell-penetrating peptides and proteins. In BSR cells, +36 GFP-Cre yielded more recombinant cells at lower concentrations than other known cell-penetrating peptides and proteins. Chloroquine greatly enhanced functional +36 GFP-mediated Cre delivery in BSR cells, suggesting an endosomal escape bottleneck.

While supercharged protein-mediated cell-penetration potency can be much greater (>100-fold) than that of other methods, delivery of functional siRNA, DNA, or protein was observed to be only ~3- to 20-fold more efficient than with conventional methods. The endosomal escape bottleneck, as indicated by the results described elsewhere herein, is likely limiting the delivery of functional nucleic acids and proteins. One exemplary approach to address the endosomal bottleneck is to combine a supercharged protein or agent to be delivered with an agent that disrupts endosomolytic vesicles or enhances the degradation of endosomes (e.g., chloroquine, pyrene butyric acid, fusogenic peptides, polyethyleneimine, hemagglutinin 2 (HA2) peptide, melittin peptide). Peptides and proteins can be fused post-translationally, for example, by a sortase, if the original peptide/protein contains the respective sortase recognition sequences. FIG. 54 shows a schematic of a screening assay to identify peptides that effect efficient endosomal escape of an agent (e.g., Cre recombinase) after delivery to a cell by a supercharged protein (e.g., +36 GFP). In the example shown, the Cre recombinase carries a sortase recognition sequence (LPETG, SEQ ID NO: 108). When combined with a library of candidate peptides carrying a second sortase recognition sequence (GGGG, SEQ ID NO: 109), the sortase-mediated peptide-conjugation reaction yields peptide-conjugated supercharged protein/agent complexes. Peptides effecting efficient endosomal escape can be identified by incubating these complexes with cells harboring a suitable reporter construct (e.g., loxP-luciferase). A weak reporter signal indicates poor endosomal escape of the respective complex, while a strong reporter signal indicates efficient endosomal escape.
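The conjugation step of such a screen can be illustrated at the sequence level. The following is a minimal sketch, assuming the commonly described sortase A transpeptidation, in which the LPETG motif is cleaved between Thr and Gly and the N-terminal oligoglycine of the second partner is joined to the Thr. The function name and the toy sequences are hypothetical illustrations and are not taken from the constructs described herein.

```python
SORTASE_MOTIF = "LPETG"  # recognition sequence given in the text (SEQ ID NO: 108)
ACCEPTOR_TAG = "GGGG"    # second recognition sequence given in the text (SEQ ID NO: 109)


def sortase_ligate(donor: str, acceptor: str) -> str:
    """Return an idealized ligation product of a donor and an acceptor sequence.

    Assumption: the donor is cut after the Thr of its last LPETG motif and the
    acceptor, which must begin with GGGG, is joined at that position.
    """
    idx = donor.rfind(SORTASE_MOTIF)
    if idx == -1:
        raise ValueError("donor lacks an LPETG motif")
    if not acceptor.startswith(ACCEPTOR_TAG):
        raise ValueError("acceptor must start with GGGG")
    # Keep everything up to and including LPET; residues after Thr are released.
    return donor[: idx + 4] + acceptor


# Hypothetical toy sequences, for illustration only.
product = sortase_ligate("MKVALPETGHH", "GGGGWEAK")
```

In this toy case the donor contributes "MKVALPET" and the acceptor is appended intact, so each candidate peptide in a library would yield a distinct product that could then be scored by the reporter readout.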

Sortase enzymes, recognition sequences, and sortase-mediated protein ligation strategies and methods are well known to those of skill in the art (see, e.g., Prat, “Sortase-mediated protein ligation: an emerging biotechnology tool for protein modification and immobilization” Biotechnol Lett. 2010 January; 32(1):1-10, incorporated herein by reference for disclosure of methods and reagents for sortase-mediated protein ligation).

Delivery of Proteins Non-Covalently Associated with Supercharged Protein

To determine whether supercharged proteins can deliver non-covalently bound agents, for example, a non-covalently bound small molecule or a non-covalently bound protein, cells were treated with +52 streptavidin non-covalently bound to biotinylated Alexa594 or biotinylated mCherry at concentrations of 2 μM and 500 nM (FIG. 46). Wild type streptavidin was used as a control. +52 streptavidin delivered biotinylated small molecules and biotinylated protein into cells. These results expand the scope of deliverable molecules to agents, like small molecules and proteins, that cannot be covalently bound to or expressed as a fusion with a supercharged protein. As evidenced above, such agents can efficiently be delivered when bound non-covalently to a supercharged protein.

Discussion

The development of more effective protein delivery methods would expand and enhance opportunities for studying and manipulating biological pathways. Our findings establish that superpositively charged GFP, when fused to a variety of proteins, can deliver proteins quickly and efficiently into a variety of mammalian cells. The cell-penetrating ability of +36 GFP tolerates translational fusion to mCherry, ubiquitin, and Cre recombinase. Supercharged GFP can deliver these proteins into mammalian cells at low nanomolar concentrations and in minutes. In a side-by-side comparison across three mammalian cell types, supercharged GFP delivered ~10- to 100-fold more fused mCherry than either Tat or Arg₉, two widely used cell-penetrating peptides. Likewise, products of Cre recombinase activity were observed with greater frequency in HeLa cells, mouse 3T3 cells, and mES cells treated with +36 GFP-Cre fusions than in the same cells treated with the same concentrations of Tat-Cre or Arg₉-Cre fusions.

The delivery of ubiquitin-+36 GFP fusions in a manner that resulted in deubiquitination by cytosolic DUBs indicates that fusion proteins delivered in this manner are capable of accessing the cytosol and are not limited to endosomal localization. Likewise, the ability of +36 GFP-Cre fusions to effect recombination in the nucleus further establishes that these fusions can access non-endosomal regions of mammalian cells when proteolytically labile linkers are used to connect +36 GFP and the protein of interest.

It was previously reported that the capacity of scGFPs to penetrate mammalian cells increases as a function of theoretical net charge even at charges as high as +25 and +36 (see U.S. Provisional Application Nos. 61/173,430 and 61/105,287, and PCT Application PCT/US2009/041984, incorporated herein by reference). This property contrasts with peptidic PTDs such as arginine oligomers, which have been observed to lose mammalian cell penetration ability when their net theoretical charge exceeds +15 (Mitchell et al., J. Pept. Res. 56, 318-325, 2000). The cell penetration potency of +36 GFP may therefore be due in part to charge distribution over a comparatively large area, which may provide a more stable and extended cationic surface that interacts more effectively with mammalian cells. The significantly greater potency of +36 GFP-mediated protein delivery compared with that of Tat and Arg₉ may also be a consequence of its structure. Unlike the globular β-barrel of GFP, the nine-residue Tat and Arg₉ peptides are unlikely to be well-folded, although the former has been observed to adopt a structure similar to a poly(proline) II helix (Ruzza et al., J. Pept. Sci. 10, 423-426, 2004).
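The notion of a theoretical net charge can be made concrete with a short calculation. The following is a minimal sketch under a stated simplifying assumption that is not defined in this description: each Lys or Arg counts as +1 and each Asp or Glu as −1, while His, the termini, and pH effects are ignored. The function name is hypothetical.

```python
def theoretical_net_charge(seq: str) -> int:
    """Estimate net charge as (Lys + Arg) minus (Asp + Glu).

    Assumption: a simplified residue-counting rule; histidine, chain termini,
    and pH-dependent ionization are deliberately ignored.
    """
    seq = seq.upper()
    positives = sum(seq.count(aa) for aa in "KR")
    negatives = sum(seq.count(aa) for aa in "DE")
    return positives - negatives


# Nine-residue peptides discussed in the text.
arg9_charge = theoretical_net_charge("RRRRRRRRR")
tat_charge = theoretical_net_charge("RKKRRQRRR")
```

Under this counting rule the Arg₉ peptide scores +9 and the nine-residue Tat peptide scores +8; the same function could be applied to any of the protein sequences listed elsewhere herein.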

The detailed mechanism by which +36 GFP protein fusions can escape endosomes before or after proteolytic cleavage from +36 GFP remains to be determined. One possibility is that the high concentration of ionizable groups in +36 GFP (including 72 basic amino acids, predominantly at surface-exposed positions) buffers endosomes during acidification, promoting endosome swelling and endosomal leakage. This mechanism has been previously implicated in the release of macromolecules delivered by synthetic polyamines (Boussif et al., Proc. Natl. Acad. Sci. U.S.A. 92, 7297-7301, 1995; Sonawane et al., J. Biol. Chem. 278, 44826-44831, 2003). Protein escape may also result from the stochastic leakage of proteins from endosomes packed with large amounts of the +36 GFP fusion protein. Endosomal integrity may vary according to cell type and may explain differences in functional protein delivery efficiency. As enhanced endosomal escape has been reported in the presence of the anionic lipid-like small molecule pyrene butyrate (Takeuchi et al., ACS Chem. Bio. 1, 299-303, 2006), it is also possible that anionic lipids from E. coli may co-purify with a +36 GFP fusion and enhance endosomal escape.

Although +36 GFP is considerably larger than Tat or Arg₉ (29 kDa vs. ~1 kDa), +36 GFP can be fused to proteins of interest via a proteolytically labile linker so that the proteins can exist intracellularly in a relatively unmodified form. Cleavage of such a linker decreased GFP fluorescence quenching and FRET in the case of the +36 GFP-mCherry fusion, and restored Cre recombinase activity in vitro.

Methods

Design, Expression, and Purification of Protein Fusions.

Protein fusions involving +36 GFP were constructed as: +36 GFP-(GGS)₄-ALAL-(GGS)₄-mCherry-His₆; +36 GFP-(GGS)₄-ALAL-(GGS)₄-Cre-His₆; and His₆-Ubiquitin-+36 GFP. Tat fusions were constructed as Tat-T7 tag-(protein of interest)-His₆ (Wadia et al., Nat. Med. 10, 310-315, 2004). Arg₉ fusions were constructed similarly to previously used polyarginine-tagged proteins⁷, in the form His₆-(protein of interest)-(GGGS)₂-Arg₉. Complete protein sequences are listed elsewhere herein. Genes encoding each fusion were cloned into a pET vector and transformed into BL21(DE3) E. coli. Cells were grown in 1 L LB cultures at 37° C. to OD₆₀₀=~0.6 and induced with 1 mM IPTG at 30° C. for 4 h. Cells were harvested by centrifugation, resuspended in 40 mL PBS+2 M NaCl, and lysed by sonication.
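The modular layout of the cleavable fusions can be sketched as simple string assembly. The following is an illustrative sketch only: the helper names are hypothetical, the domain arguments are short placeholders rather than real sequences, and any extra residues that cloning may introduce between modules are ignored.

```python
def ggs_linker(repeats: int) -> str:
    """Return a (GGS)n flexible linker as a one-letter amino acid string."""
    return "GGS" * repeats


# (GGS)4-ALAL-(GGS)4, the linker layout named in the construct descriptions.
CLEAVABLE_LINKER = ggs_linker(4) + "ALAL" + ggs_linker(4)


def assemble_fusion(n_part: str, c_part: str,
                    linker: str = CLEAVABLE_LINKER,
                    his_tag: str = "HHHHHH") -> str:
    """Join N-terminal part, linker, C-terminal part, and a C-terminal His6 tag.

    Assumption: modules are concatenated directly, with no junction residues.
    """
    return n_part + linker + c_part + his_tag


# Placeholder domains, for illustration only (not real protein sequences).
example = assemble_fusion("MAAA", "VSKG")
```

Keeping the linker as a separate module makes it easy to swap the ALAL-containing linker for another linker design while leaving the two protein modules unchanged.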

The lysate was cleared by centrifugation (10,000 × g, 8 min) and the supernatant was mixed with 1 mL of Ni-NTA agarose resin (Qiagen) for 30 minutes at 4° C. on a rotating drum. The resin was recovered by centrifugation (10,000 × g, 8 min), resuspended in 20 mL PBS+2 M NaCl, and packed into a 5 mL syringe containing a glass wool plug. The resin was washed with 15 mL of PBS containing 2 M NaCl and 20 mM imidazole. The protein fusion was eluted with 3 mL PBS containing 2 M NaCl and 500 mM imidazole. The eluate was immediately dialyzed against PBS+1 M NaCl at 4° C. for one hour. All fusions except for the +36 GFP-Cre protein were dialyzed against PBS at 4° C. overnight; the +36 GFP-Cre fusion was dialyzed against PBS+500 mM NaCl to minimize precipitation. Tat-Cre and Arg₉-Cre fusions were stored at −20° C. in 20% glycerol. Purified GFPs were centrifuged after dialysis to remove any precipitated protein or contaminants and quantitated by absorbance at 488 nm assuming an extinction coefficient of 8.33×10⁴ M⁻¹cm⁻¹ (Pedelacq et al., Nat. Biotechnol. 24, 79-88, 2006). Purified mCherry fusions were quantified by absorbance at 587 nm assuming an extinction coefficient of 7.2×10⁴ M⁻¹cm⁻¹ (Shaner et al., Nat. Biotechnol. 22, 1567-1572, 2004). Tat-Cre and Arg₉-Cre were quantified using a Modified Lowry Protein Assay Kit (Pierce). Proteins were evaluated by SDS-PAGE analysis (FIG. 39).
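The absorbance-based quantification can be expressed with the Beer-Lambert relation, c = A / (ε·l). The following is a minimal sketch using the two extinction coefficients stated above; the 1 cm path length, the dilution-factor argument, the function name, and the example absorbance reading are assumptions added for illustration.

```python
GFP_EXT_488 = 8.33e4      # M^-1 cm^-1, value stated in the text for GFP at 488 nm
MCHERRY_EXT_587 = 7.2e4   # M^-1 cm^-1, value stated in the text for mCherry at 587 nm


def molar_concentration(absorbance: float,
                        extinction_coefficient: float,
                        path_length_cm: float = 1.0,
                        dilution_factor: float = 1.0) -> float:
    """Return concentration in mol/L from the Beer-Lambert law.

    Assumptions: a 1 cm cuvette unless stated otherwise, and a reading taken
    in the linear range of the spectrophotometer.
    """
    if extinction_coefficient <= 0 or path_length_cm <= 0:
        raise ValueError("extinction coefficient and path length must be positive")
    return absorbance * dilution_factor / (extinction_coefficient * path_length_cm)


# Hypothetical reading: A488 = 0.833 for an undiluted GFP fusion sample.
gfp_molar = molar_concentration(0.833, GFP_EXT_488)
gfp_micromolar = gfp_molar * 1e6
```

With this hypothetical reading, the GFP fusion concentration comes out at about 10 μM; substituting `MCHERRY_EXT_587` and an absorbance at 587 nm gives the corresponding value for an mCherry fusion.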

Cell Culture

HeLa, IMCD and PC12 cells were cultured in Dulbecco's modification of Eagle's medium (DMEM, Sigma) with 10% fetal bovine serum (FBS, Sigma), 2 mM glutamine, 5 I.U. penicillin, and 5 μg/mL streptomycin. All cells were cultured at 37° C. with 5% CO₂. PC12 cells were purchased from ATCC.

Fixed-Cell Imaging

Cells were plated directly onto glass cover slips in a six-well tissue culture plate at a density of 10⁶ cells per well. After 12 h, cells were washed once with cold PBS and incubated with protein in serum-free DMEM. Cells were washed three times with 20 U/mL heparin PBS to remove membrane-bound protein, fixed in 4% formaldehyde in PBS, stained with DAPI, and imaged with an Olympus IX71 spinning disk confocal microscope. GFP and mCherry were visualized by confocal laser microscopy with a 491 nm and 561 nm excitation laser, respectively. DAPI stain was imaged by widefield fluorescence. Images were prepared using OpenLab software (Improvision).

Cathepsin B-Mediated Linker Cleavage.

The +36 GFP-mCherry and +36 GFP-Cre fusions were cleaved by incubating 30 pmol of protein in 10 μL of 20 mM MES, pH 6.5 with 500 ng of cathepsin B from human liver (Sigma) at 37° C. for 45 min. A 1 μL aliquot of the +36 GFP-Cre fusion cleavage reaction was used for the Cre recombinase in vitro assay; the remaining 9 μL were used for Western blot analysis. Anti-GFP (1/10,000 dilution, ab290) and anti-Cre (1/2,000 dilution, ab24607) primary antibodies were purchased from Abcam. Anti-mCherry primary antibody (1/2,000 dilution, 632393) was purchased from Clontech. Western blots were performed as described above.
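The substrate concentration in the cleavage reaction follows directly from the stated amount and volume. The following is a small worked sketch; the function name is hypothetical, and the only inputs are the 30 pmol and 10 μL figures and the 1 μL and 9 μL split given above.

```python
def micromolar(amount_pmol: float, volume_ul: float) -> float:
    """Convert an amount in pmol and a volume in μL to a concentration in μM.

    1 pmol/μL equals 1 μmol/L, so the ratio is already in micromolar units.
    """
    if volume_ul <= 0:
        raise ValueError("volume must be positive")
    return amount_pmol / volume_ul


# Values stated in the text: 30 pmol of fusion protein in a 10 μL reaction.
reaction_um = micromolar(30, 10)

# Amount of fusion protein carried in each portion of the 10 μL reaction.
aliquot_pmol = reaction_um * 1   # 1 μL aliquot for the in vitro Cre assay
western_pmol = reaction_um * 9   # remaining 9 μL for Western blot analysis
```

On these figures the fusion protein is present at 3 μM, with 3 pmol carried into the in vitro assay and 27 pmol into the Western blot.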

Deubiquitination Assay.

HeLa cells were seeded in a 24-well tissue culture plate at a density of 100,000 cells per well. After 12 h, cells were washed once with PBS and incubated either with ubiquitin-+36 GFP or with ubiquitin(G76V)-+36 GFP at 500 nM in serum-free DMEM for 1 hour at 37° C. Cells were washed three times with 20 U/mL heparin PBS to remove cell surface-bound protein. Cells were incubated with 250 μL of ice-cold PBS, allowed to detach from the plate, lysed by adding 250 μL of denaturing LDS Sample Buffer (Invitrogen) to the well, transferred to a microcentrifuge tube, heated at 95° C. for 10 minutes, and loaded on a 12% SDS-PAGE gel. Alternatively, untreated cells were washed and lysed as described with 100 pmoles of ubiquitin-+36 GFP or ubiquitin(G76V)-+36 GFP spiked into the denaturing LDS Sample Buffer. Crude HeLa cytosolic extract was prepared by harvesting cells with a cell scraper and lysing in non-denaturing 0.5% Triton X-100 containing 1.7 μg/mL aprotinin, 10 μg/mL leupeptin, and 1 mM PMSF. The lysate was cleared by centrifugation (13,000 × g, 10 minutes) to yield the cytosolic fraction. 50 fmol of either ubiquitin-+36 GFP or ubiquitin(G76V)-+36 GFP was added to 250 μL of lysate and incubated at 37° C. for 30 minutes either with or without the addition of 10 mM N-ethylmaleimide.

Cre Recombinase Cellular Assay.

Cre fusions were assayed for activity in HeLa cells using pCALNL-DsRed2. HeLa cells were transiently transfected with pCALNL-DsRed2 using Effectene (Qiagen) according to the manufacturer's protocol. 24 h after transfection, purified Cre fusion proteins were added into serum-free DMEM, incubated at room temperature for 5 minutes and applied to cells. Cells were incubated for 4 h at 37° C. and washed three times with 20 U/mL heparin PBS to remove membrane-bound protein. 24 h after protein treatment, cells were assayed for recombination by flow cytometry and live-cell microscopy.

Cre fusions were assayed for in vivo activity in 3T3 cells containing an integrated β-galactosidase-based Cre reporter (S. Dowdy, UCSD). 3T3.loxP.lacZ cells were treated with Cre protein fusions in complete medium. 24 h after treatment, cells were fixed in 4% formaldehyde in PBS and stained for β-galactosidase activity using the Promega In Situ β-Galactosidase Staining Kit. The number of recombinants was quantitated by counting X-gal-stained cells. Cre fusions were assayed for activity in mouse embryonic stem (mES) cells using an integrated floxed mCherry reporter (D. Melton, Harvard University). mES cells were harvested by trypsinization and MEF depletion over gelatinized plates. The MEF-depleted mES cells were seeded in gelatinized 12-well plates at 200,000 cells per well. 24 h after seeding, mES colonies were treated with purified Cre fusion proteins for 4 h in serum-free medium. Cells were washed in heparin PBS, incubated in full mES culture media for an additional 24 h, harvested by trypsinization, and analyzed by flow cytometry.

Deubiquitination Assay Western Blot

Samples were analyzed on a 12% SDS-PAGE (Invitrogen) gel and transferred by electroblotting onto a PVDF membrane (Millipore) pre-soaked in methanol. Membranes were blocked in 5% milk for 1 h, and incubated in primary antibody in 3% BSA for 30 minutes at room temperature. Anti-GFP (1/10,000 dilution, ab290) and anti-His₆ (1/2,500 dilution, ab18184) primary antibodies were purchased from Abcam. The membrane was washed three times with PBS and treated with the secondary antibodies, IRDye 800CW Goat Anti-Mouse IgG (1/10,000 dilution, Li-COR Biosciences) and IRDye 680 Goat Anti-Rabbit IgG (1/10,000 dilution, Li-COR Biosciences), in blocking buffer (Li-COR Biosciences) for 30 minutes. The membrane was washed three times with 50 mM Tris, pH 7.4 containing 150 mM NaCl and 0.05% Tween-20 and visualized using an Odyssey infrared imaging system (Li-COR Biosciences). Images were analyzed using Odyssey imaging software version 2.0.

Cathepsin B-Mediated Linker Cleavage

The +36 GFP-mCherry and +36 GFP-Cre fusions were cleaved by incubating 30 pmol of protein in 10 μL of 20 mM MES, pH 6.5 with 500 ng of cathepsin B from human liver (Sigma) at 37° C. for 45 min. A 1 μL aliquot of the +36 GFP-Cre fusion cleavage reaction was used for the Cre recombinase in vitro assay; the remaining 9 μL were used for Western blot analysis.

Anti-GFP (1/10,000 dilution, ab290) and anti-Cre (1/2,000 dilution, ab24607) primary antibodies were purchased from Abcam. Anti-mCherry primary antibody (1/2,000 dilution, 632393) was purchased from Clontech. Western blots were performed as described above.

Protein Sequences

stGFP: (SEQ ID NO: 110)MGHHHHHHGGASKGEELFTGVVPILVELDGDVNGHKFSVRGEGEGDATNGKLTLKFICTTGKLPVPWPTLVTTLTYGVQCFSRYPDHMKQHDFFKSAMPEGYVQERTISFKDDGTYKTRAEVKFEGDTLVNRIELKGIDFKEDGNILGHKLEYNFNSHNVYITADKQKNGIKANFKIRHNVEDGSVQLADHYQQNTPIGDGPVLLPDNHYLSTQSALSKDPNEKRDHMVLLEFVTAAGITHGMDELYK

Examples of Supercharged GFPs and Fusion Proteins:

+36 GFP: (SEQ ID NO: 111)MGHHHHHHGGASKGERLFRGKVPILVELKGDVNGHKFSVRGKGKGDATRGKLTLKFICTTGKLPVPWPTLVTTLTYGVQCFSRYPKHMKRHDFFKSAMPKGYVQERTISFKKDGKYKTRAEVKFEGRTLVNRIKLKGRDFKEKGNILGHKLRYNFNSHKVYITADKRKNGIKAKFKIRHNVKDGSVQLADHYQQNTPIGRGPVLLPRNHYLSTRSKLSKDPKEKRDHMVLLEFVTAAGIKHGRDERYK +36 GFP-mCherry:(SEQ ID NO: 112) MASKGERLFRGKVPILVELKGDVNGHKFSVRGKGKGDATRGKLTLKFICTTGKLPVPWPTLVTTLTYGVQCFSRYPKHMKRHDFFKSAMPKGYVQERTISFKKDGKYKTRAEVKFEGRTLVNRIKLKGRDFKEKGNILGHKLRYNFNSHKVYITADKRKNGIKAKFKIRHNVKDGSVQLADHYQQNTPIGRGPVLLPRNHYLSTRSKLSKDPKEKRDHMVLLEFVTAAGIKHGRDERYKGGSGGSGGSGGSALALGGSGGSGGSGGSVSKGEEDNMAIIKEFMRFKVHMEGSVNGHEFEIEGEGEGRPYEGTQTAKLKVTKGGPLPFAWDILSPQFMYGSKAYVKHPADIPDYLKLSFPEGFKWERVMNFEDGGVVTVTQDSSLQDGEFIYKVKLRGTNFPSDGPVMQKKTMGWEASSERMYPEDGALKGEIKQRLKLKDGGHYDAEVKTTYKAKKPVQLPGAYNVNIKLDITSHNEDYTIVEQYERAEGRHSTGGMDEL YKLEHHHHHHH₃₉ GFP (His39 GFP) (SEQ ID NO: 133)MGASKGEHLFHGHVPILVELHGDVNGHKFSVRGHGHGDATHGKLTLKFICTTGKLPVPWPTLVTTLTYGVQCFSRYPHHMKHHDFFKSAMPHGYVQERTISFKHDGHYKTRAEVKFEGHTLVNRIHLKGHDFKEHGNILGHKLHYNFNSHHVYITADKHKNGIKAHFKIRHNVHDGSVQLADHYQQNTPIGHGPVLLPHNHYLSTHSHLSKDPHEKRDHMVLLEFVTAAGIHHGHDEHYK Ubiquitin- +36 GFP:(SEQ ID NO: 113) MGHHHHHHGGMQIFVKTLTGKTITLEVEPSDTIENVKAKIQDKEGIPPDQQRLIFAGKQLEDGRTLSDYNIQKESTLHLVLRLRGGASKGERLFRGKVPILVELKGDVNGHKFSVRGKGKGDATRGKLTLKFICTTGKLPVPWPTLVTTLTYGVQCFSRYPKHMKRHDFFKSAMPKGYVQERTISFKKDGKYKTRAEVKFEGRTLVNRIKLKGRDFKEKGNILGHKLRYNFNSHKVYITADKRKNGIKAKFKIRHNVKDGSVQLADHYQQNTPIGRGPVLLPRNHYLSTRSKLSKDPKEKRDHMVLLEFVTAAGIKHGRDERYK Ubiquitin G76V- +36 GFP: (SEQ ID NO: 114)MGHHHHHHGGMQIFVKTLIGKTITLEVEPSDTIENVKAKIQDKEGIPPDQQRLIFAGKQLEDGRTLSDYNIQKESTLHLVLRLRGVASKGERLFRGKVPILVELKGDVNGHKFSVRGKGKGDATRGKLTLKFICTTGKLPVPWPTLVTTLTYGVQCFSRYPKHMKRHDFFKSAMPKGYVQERTISFKKDGKYKTRAEVKFEGRTLVNRIKLKGRDFKEKGNILGHKLRYNFNSHKVYITADKRKNGIKAKFKIRHNVKDGSVQLADHYQQNTPIGRGPVLLPRNHYLSTRSKLSKDPKEKRDHMVLLEFVTAAGIKHGRDERYK +36 GFP-Cre: (SEQ ID NO: 
115)MASKGERLFRGKVPILVELKGDVNGHKFSVRGKGKGDATRGKLTLKFICTTGKLPVPWPTLVTTLTYGVQCFSRYPKHMKRHDFFKSAMPKGYVQERTISFKKDGKYKTRAEVKFEGRTLVNRIKLKGRDFKEKGNILGHKLRYNFNSHKVYITADKRKNGIKAKFKIRHNVKDGSVQLADHYQQNTPIGRGPVLLPRNHYLSTRSKLSKDPKEKRDHMVLLEFVTAAGIKHGRDERYKGGSGGSGGSGGSALALGGSGGSGGSGGSMASNLLTVHQNLPALPVDATSDEVRKNLMDMFRDRQAFSEHTWKMLLSVCRSWAAWCKLNNRKWFPAEPEDVRDYLLYLQARGLAVKTIQQHLGQLNMLHRRSGLPRPSDSNAVSLVMRRIRKENVDAGERAKQALAFERTDFDQVRSLMENSDRCQDIRNLAFLGIAYNTLLRIAEIARIRVKDISRTDGGRMLIHIGRTKTLVSTAGVEKALSLGVTKLVERWISVSGVADDPNNYLFCRVRKNGVAAPSATSQLSTRALEGIFEATHRLIYGAKDDSGQRYLAWSGHSARVGAARDMARAGVSIPEIMQAGGWTNVNIVMNYIRNLDSET GAMVRLLEDGDHHHHHH+36 GFP-(GGS)4-ALAL-(GGS)4-Cre: (SEQ ID NO: 147)MASKGERLFRGKVPILVELKGDVNGHKFSVRGKGKGDATRGKLTLKFICTTGKLPVPWPTLVTTLTYGVQCFSRYPKHMKRHDFFKSAMPKGYVQERTISFKKDGKYKTRAEVKFEGRTLVNRIKLKGRDFKEKGNILGHKLRYNFNSHKVYITADKRKNGIKAKFKIRHNVKDGSVQLADHYQQNTPIGRGPVLLPRNHYLSTRSKLSKDPKEKRDHMVLLEFVTAAGIKHGRDERYKGGSGGSGGSGGSALALGGSGGSGGSGGSMASNLLTVHQNLPALPVDATSDEVRKNLMDMFRDRQAFSEHTWKMLLSVCRSWAAWCKLNNRKWFPAEPEDVRDYLLYLQARGLAVKTIQQHLGQLNMLHRRSGLPRPSDSNAVSLVMRRIRKENVDAGERAKQALAFERTDFDQVRSLMENSDRCQDIRNLAFLGIAYNTLLRIAEIARIRVKDISRTDGGRMLIHIGRTKTLVSTAGVEKALSLGVTKLVERWISVSGVADDPNNYLFCRVRKNGVAAPSATSQLSTRALEGIFEATHRLIYGAKDDSGQRYLAWSGHSARVGAARDMARAGVSIPEIMQAGGWTNVNIVMNYIRNLDSET GAMVRLLEDGDHHHHHH

PTD Fusion Proteins:

Tat-stGFP: (SEQ ID NO: 116)MGRKKRRQRRRGHMASMTGGQQMGRDPASKGEELFTGVVPILVELDGDVNGHKFSVRGEGEGDATNGKLTLKFICTTGKLPVPWPTLVTTLTYGVQCFSRYPDHMKQHDFFKSAMPEGYVQERTISFKDDGTYKTRAEVKFEGDTLVNRIELKGIDFKEDGNILGHKLEYNFNSHNVYITADKQKNGIKANFKIRHNVEDGSVQLADHYQQNTPIGDGPVLLPDNHYLSTQSALSKDPNEKRDHMVLLEFVTAAGITHGMDELYKAAALEHHHHHH Tat-mCherry: (SEQ ID NO: 117)MGRKKRRQRRRGHMASMTGGQQMGRDPNSVSKGEEDNMAIIKEFMRFKVHMEGSVNGHEFEIEGEGEGRPYEGTQTAKLKVTKGGPLPFAWDILSPQFMYGSKAYVKHPADIPDYLKLSFPEGFKWERVMNFEDGGVVTVTQDSSLQDGEFIYKVKLRGTNFPSDGPVMQKKTMGWEASSERMYPEDGALKGEIKQRLKLKDGGHYDAEVKITYKAKKPVQLPGAYNVNIKLDITSHNEDYTIVEQYERAEGRHSTGGMDELYKARGAAALEHHHHHH mCherry-Arg9: (SEQ ID NO: 118)MGHHHHHHGGASKGEEDNMAIIKEFMRFKVHMEGSVNGHEFEIEGEGEGRPYEGTQTAKLKVTKGGPLPFAWDILSPQFMYGSKAYVKHPADIPDYLKLSFPEGFKWERVMNFEDGGVVTVTQDSSLQDGEFIYKVKLRGTNFPSDGPVMQKKTMGWEASSERMYPEDGALKGEIKQRLKLKDGGHYDAEVKTTYKAKKPVQLPGAYNVNIKLDITSHNEDYTIVEQYERAEGRHSTGGMDELYKARGGG SGGGSRRRRRRRRRArg10-mCherry: (SEQ ID NO: 142)MRRRRRRRRRRGGSGGSGGSGGSGGSGGSGGSGGSGGSVSKGEEDNMAIIKEFMRFKVHMEGSVNGHEFEIEGEGEGRPYEGTQTAKLKVTKGGPLPFAWDILSPQFMYGSKAYVKHPADIPDYLKLSFPEGFKWERVMNFEDGGVVTVTQDSSLQDGEFIYKVKLRGTNFPSDGPVMQKKTMGWEASSERMYPEDGALKGEIKQRLKLKDGGHYDAEVKTTYKAKKPVQLPGAYNVNIKLDITSHNEDYTIVEQYERAEGRHSTGGMDELYKLEHHHHHH penetratin-mCherry: (SEQ ID NO: 143)MRQIKIWFQNRRMKWKKGGSGGSGGSGGSGGSGGSGGSGGSGGSVSKGEEDNMAIIKEFMRFKVHMEGSVNGHEFEIEGEGEGRPYEGTQTAKLKVTKGGPLPFAWDILSPQFMYGSKAYVKHPADIPDYLKLSFPEGFKWERVMNFEDGGVVTVTQDSSLQDGEFIYKVKLRGTNFPSDGPVMQKKTMGWEASSERMYPEDGALKGEIKQRLKLKDGGHYDAEVKTTYKAKKPVQLPGAYNVNIKLDITSHNEDYTIVEQYERAEGRHSTGGMDELYKLEHHHHHH Tat-Cre: (SEQ ID NO: 119)MGRKKRRQRRRGHMASMTGGQQMGRDPNSMSNLLTVHQNLPALPVDATSDEVRKNLMDMFRDRQAFSEHTWKMLLSVCRSWAAWCKLNNRKWFPAEPEDVRDYLLYLQARGLAVKTIQQHLGQLNMLHRRSGLPRPSDSNAVSLVMRRIRKENVDAGERAKQALAFERTDFDQVRSLMENSDRCQDIRNLAFLGIAYNTLLRIAEIARIRVKDISRTDGGRMLIHIGRTKTLVSTAGVEKALSLGVTKLVERWISVSGVADDPNNYLFCRVRKNGVAAPSATSQLSTRALEGIFEATHRLIYGAKDDSGQRYLAWSGHSARVGAARDMARAGVSIPEIMQAGGWTNVNIVMNYIRNLDSETGAMVRLLEDGDAAALEHHHHHH Cre-Arg9: (SEQ ID NO: 
120)MGHHHHHHGGASMSNLLTVHQNLPALPVDATSDEVRKNLMDMFRDRQAFSEHTWKMLLSVCRSWAAWCKLNNRKWFPAEPEDVRDYLLYLQARGLAVKTIQQHLGQLNMLHRRSGLPRPSDSNAVSLVMRRIRKENVDAGERAKQALAFERTDFDQVRSLMENSDRCQDIRNLAFLGIAYNTLLRIAEIARIRVKDISRTDGGRMLIHIGRTKTLVSTAGVEKALSLGVTKLVERWISVSGVADDPNNYLFCRVRKNGVAAPSATSQLSTRALEGIFEATHRLIYGAKDDSGQRYLAWSGHSARVGAARDMARAGVSIPEIMQAGGWTNVNIVMNYIRNLDSETGAMVRLLEDGDRGGGSGGGSRRRRRRRRR Tat-Cre (FIG. 52 and 53): (SEQ ID NO: 144)MGRKKRRQRRRGGSGGSGGSGGSGGSGGSGGSGGSGGSMASNLLTVHQNLPALPVDATSDEVRKNLMDMFRDRQAFSEHTWKMLLSVCRSWAAWCKLNNRKWFPAEPEDVRDYLLYLQARGLAVKTIQQHLGQLNMLHRRSGLPRPSDSNAVSLVMRRIRKENVDAGERAKQALAFERTDFDQVRSLMENSDRCQDIRNLAFLGIAYNTLLRIAEIARIRVKDISRTDGGRMLIHIGRTKTLVSTAGVEKALSLGVTKLVERWISVSGVADDPNNYLFCRVRKNGVAAPSATSQLSTRALEGIFEATHRLIYGAKDDSGQRYLAWSGHSARVGAARDMARAGVSIPEIMQAGGWTNVNIVMNYIRNLDSETGAMVRLLEDGDHHHHHH Arg10-Cre: (SEQ ID NO: 145)MRRRRRRRRRRGGSGGSGGSGGSGGSGGSGGSGGSGGSMASNLLTVHQNLPALPVDATSDEVRKNLMDMFRDRQAFSEHTWKMLLSVCRSWAAWCKLNNRKWFPAEPEDVRDYLLYLQARGLAVKTIQQHLGQLNMLHRRSGLPRPSDSNAVSLVMRRIRKENVDAGERAKQALAFERTDFDQVRSLMENSDRCQDIRNLAFLGIAYNTLLRIAEIARIRVKDISRTDGGRMLIHIGRTKTLVSTAGVEKALSLGVTKLVERWISVSGVADDPNNYLFCRVRKNGVAAPSATSQLSTRALEGIFEATHRLIYGAKDDSGQRYLAWSGHSARVGAARDMARAGVSIPEIMQAGGWTNVNIVMNYIRNLDSETGAMVRLLEDGDHHHHHH penetratin-Cre: (SEQ ID NO: 146)MRQIKIWFQNRRMKWKKGGSGGSGGSGGSGGSGGSGGSGGSGGSMASNLLTVHQNLPALPVDATSDEVRKNLMDMFRDRQAFSEHTWKMLLSVCRSWAAWCKLNNRKWFPAEPEDVRDYLLYLQARGLAVKTIQQHLGQLNMLHRRSGLPRPSDSNAVSLVMRRIRKENVDAGERAKQALAFERTDFDQVRSLMENSDRCQDIRNLAFLGIAYNTLLRIAEIARIRVKDISRTDGGRMLIHIGRTKTLVSTAGVEKALSLGVTKLVERWISVSGVADDPNNYLFCRVRKNGVAAPSATSQLSTRALEGIFEATHRLIYGAKDDSGQRYLAWSGHSARVGAARDMARAGVSIPEIMQAGGWTNVNIVMNYIRNLDSETGAMVRLLEDGDHHHHHH Tat-T7 tag-mCherry:(SEQ ID NO: 148) MGRKKRRQRRRGHMASMTGGQQMGRDPNSVSKGEEDNMAIIKEFMRFKVHMEGSVNGHEFEIEGEGEGRPYEGTQTAKLKVTKGGPLPFAWDILSPQFMYGSKAYVKHPADIPDYLKLSFPEGFKWERVMNFEDGGVVTVTQDSSLQDGEFIYKVKLRGTNFPSDGPVMQKKTMGWEASSERMYPEDGALKGEIKQRLKLKDGGHYDAEVKITYKAKKPVQLPGAYNVNIKLDITSHNEDYTIVEYERAEGRHSTGGMDELYKARGAAALEHHHHHH Tat-T7 tag-Cre: (SEQ ID NO: 
149)MGRKKRRQRRRGHMASMTGGQQMGRDPNSMSNLLTVHQNLPALPVDATSDEVRKNLMDMFRDRQAFSEHTWKMLLSVCRSWAAWCKLNNRKWFPAEPEDVRDYLLYLQARGLAVKTIQQHLGQLNMLHRRSGLPRPSDSNAVSLVMRRIRKENVDAGERAKQALAFERTDFDQVRSLMENSDRCQDIRNLAFLGIAYNTLLRIAEIARIRVKDISRTDGGRMLIHIGRTKTLVSTAGVEKALSLGVTKLVERWISVSGVADDPNNYLFCRVRKNGVAAPSATSQLSTRALEGIFEATHRLIYGAKDDSGQRYLAWSGHSARVGAARDMARAGVSIPEIMQAGGWTNVNIVMNYIRNLDSETGAMVRLLEDGDAAALEHHHHHH

Examples of Naturally Occurring Superpositively Charged Human Proteins and Fusion Proteins:

HRX (UNIPROT: Q03164 PDB: 2J2S) HRX-(GGS)₉-mCherry-His₆ (SEQ ID NO: 121)MVKKGRRSRRCGQCPGCQVPEDCGVCTNCLDKPKFGGRNIKKQCCKMRKCQNLQWMPSKAYLQKQAKAVKGGSGGSGGSGGSGGSGGSGGSGGSGGSVSKGEEDNMAIIKEFMRFKVHMEGSVNGHEFEIEGEGEGRPYEGTQTAKLKVTKGGPLPFAWDILSPQFMYGSKAYVKHPADIPDYLKLSFPEGFKWERVMNFEDGGVVTVTQDSSLQDGEFIYKVKLRGTNFPSDGPVMQKKTMGWEASSERMYPEDGALKGEIKQRLKLKDGGHYDAEVKTTYKAKKPVQLPGAYNVNIKLDITSHNEDYTIVEQYERAEGRHSTGGMDELYKLEHHHHHH HRX-(GGS)₉-Cre-His₆(SEQ ID NO: 122) MVKKGRRSRRCGQCPGCQVPEDCGVCINCLDKPKFGGRNIKKQCCKMRKCQNLQWMPSKAYLQKQAKAVKGGSGGSGGSGGSGGSGGSGGSGGSGGSMASNLLTVHQNLPALPVDATSDEVRKNLMDMFRDRQAFSEHTWKMLLSVCRSWAAWCKLNNRKWFPAEPEDVRDYLLYLQARGLAVKTIQQHLGQLNMLHRRSGLPRPSDSNAVSLVMRRIRKENVDAGERAKQALAFERTDFDQVRSLMENSDRCQDIRNLAFLGIAYNTLLRIAEIARIRVKDISRTDGGRMLIHIGRTKTLVSTAGVEKALSLGVTKLVERWISVSGVADDPNNYLFCRVRKNGVAAPSATSQLSTRALEGIFEATHRLIYGAKDDSGQRYLAWSGHSARVGAARDMARAGVSIPEIMQAGGWTNVNIVMNYIRNLDSETGAMVRLLEDGDHHHHHHC-JUN (UNIPROT: P05412 PDB: 1JNM) C-JUN -(GGS)₉-mCherry-His₆(SEQ ID NO: 123) MKAERKRMRNRIAASKSRKRKLERIARLEEKVKTLKAQNSELASTANMLREQVAQLKQKVMNHGGSGGSGGSGGSGGSGGSGGSGGSGGSVSKGEEDNMAIIKEFMRFKVHMEGSVNGHEFEIEGEGEGRPYEGTQTAKLKVTKGGPLPFAWDILSPQFMYGSKAYVKHPADIPDYLKLSFPEGFKWERVMNFEDGGVVTVTQDSSLQDGEFIYKVKLRGTNFPSDGPVMQKKTMGWEASSERMYPEDGALKGEIKQRLKLKDGGHYDAEVKTTYKAKKPVQLPGAYNVNIKLDITSHNEDYTIVEQYERAEGRHSTGGMDELYKLEHHHHHH C-JUN -(GGS)₉-Cre-His₆(SEQ ID NO: 124) MKAERKRMRNRIAASKSRKRKLERIARLEEKVKTLKAQNSELASTANMLREQVAQLKQKVMNHGGSGGSGGSGGSGGSGGSGGSGGSGGSMASNLLTVHQNLPALPVDATSDEVRKNLMDMFRDRQAFSEHTWKMLLSVCRSWAAWCKLNNRKWFPAEPEDVRDYLLYLQARGLAVKTIQQHLGQLNMLHRRSGLPRPSDSNAVSLVMRRIRKENVDAGERAKQALAFERTDFDQVRSLMENSDRCQDIRNLAFLGIAYNTLLRIAEIARIRVKDISRTDGGRMLIHIGRTKTLVSTAGVEKALSLGVTKLVERWISVSGVADDPNNYLFCRVRKNGVAAPSATSQLSTRALEGIFEATHRLIYGAKDDSGQRYLAWSGHSARVGAARDMARAGVSIPEIMQAGGWTNVNIVMNYIRNLDSETGAMVRLLEDGDHHHHHHDEFENSIN 3 (UNIPROT: P81534 PDB: 1KJ6) DEFENSIN 3 -(GGS)₉-mCherry-His₆(SEQ ID NO: 125) 
MGIINTLQKYYCRVRGGRCAVLSCLPKEEQIGKCSTRGRKCCRRKKGGSGGSGGSGGSGGSGGSGGSGGSGGSVSKGEEDNMAIIKEFMRFKVHMEGSVNGHEFEIEGEGEGRPYEGTQTAKLKVTKGGPLPFAWDILSPQFMYGSKAYVKHPADIPDYLKLSFPEGFKWERVMNFEDGGVVTVTQDSSLQDGEFIYKVKLRGTNFPSDGPVMQKKTMGWEASSERMYPEDGALKGEIKQRLKLKDGGHYDAEVKTTYKAKKPVQLPGAYNVNIKLDITSHNEDYTIVEQYERAEGRHST GGMDELYKLEHHHHHHDEFENSIN 3 -(GGS)₉-Cre-His₆ (SEQ ID NO: 126)MGIINTLQKYYCRVRGGRCAVLSCLPKEEQIGKCSTRGRKCCRRKKGGSGGSGGSGGSGGSGGSGGSGGSGGSMASNLLTVHQNLPALPVDATSDEVRKNLMDMFRDRQAFSEHTWKMLLSVCRSWAAWCKLNNRKWFPAEPEDVRDYLLYLQARGLAVKTIQQHLGQLNMLHRRSGLPRPSDSNAVSLVMRRIRKENVDAGERAKQALAFERTDFDQVRSLMENSDRCQDIRNLAFLGIAYNTLLRIAEIARIRVKDISRTDGGRMLIHIGRTKTLVSTAGVEKALSLGVTKLVERWISVSGVADDPNNYLFCRVRKNGVAAPSATSQLSTRALEGIFEATHRLIYGAKDDSGQRYLAWSGHSARVGAARDMARAGVSIPEIMQAGGWTNVNIVMNYIRNLDSETGAMVRLLEDGDHHHHHH HBEGF (UNIPROT: Q99075 PDB: 1XDT)HBEGF-(GGS)₉-mCherry-His₆ (SEQ ID NO: 127)MRVTLSSKPQALATPNKEEHGKRKKKGKGLGKKRDPCLRKYKDFCIHGECKYVKELRAPSCICHPGYHGERCHGLSGGSGGSGGSGGSGGSGGSGGSGGSGGSVSKGEEDNMAIIKEFMRFKVHMEGSVNGHEFEIEGEGEGRPYEGTQTAKLKVTKGGPLPFAWDILSPQFMYGSKAYVKHPADIPDYLKLSFPEGFKWERVMNFEDGGVVTVTQDSSLQDGEFIYKVKLRGTNFPSDGPVMQKKTMGWEASSERMYPEDGALKGEIKQRLKLKDGGHYDAEVKTTYKAKKPVQLPGAYNVNIKLDITSHNEDYTIVEQYERAEGRHSTGGMDELYKLEHHHHHH HBEGF -(GGS)₉-Cre-His₆(SEQ ID NO: 128) MRVTLSSKPQALATPNKEEHGKRKKKGKGLGKKRDPCLRKYKDFCIHGECKYVKELRAPSCICHPGYHGERCHGLSGGSGGSGGSGGSGGSGGSGGSGGSGGSMASNLLTVHQNLPALPVDATSDEVRKNLMDMFRDRQAFSEHTWKMLLSVCRSWAAWCKLNNRKWFPAEPEDVRDYLLYLQARGLAVKTIQQHLGQLNMLHRRSGLPRPSDSNAVSLVMRRIRKENVDAGERAKQALAFERTDFDQVRSLMENSDRCQDIRNLAFLGIAYNTLLRIAEIARIRVKDISRTDGGRMLIHIGRTKTLVSTAGVEKALSLGVTKLVERWISVSGVADDPNNYLFCRVRKNGVAAPSATSQLSTRALEGIFEATHRLIYGAKDDSGQRYLAWSGHSARVGAARDMARAGVSIPEIMQAGGWTNVNIVMNYIRNLDSETGAMVRLLEDGDHH HHHHN-DEK (UNIPROT: P35659 PDB: 2JX3) N-DEK-(GGS)₉-mCherry-His₆(SEQ ID NO: 129) 
MFTIAQGKGQKLCEIERIHFFLSKKKTDELRNLHKLLYNRPGTVSSLKKNVGQFSGFPFEKGSVQYKKKEEMLKKFRNAMLKSICEVLDLERSGVNSELVKRILNFLMHPKPSGKPLPKSKKTCSKGSKKERGGSGGSGGSGGSGGSGGSGGSGGSGGSVSKGEEDNMAIIKEFMRFKVHMEGSVNGHEFEIEGEGEGRPYEGTQTAKLKVTKGGPLPFAWDILSPQFMYGSKAYVKHPADIPDYLKLSFPEGFKWERVMNFEDGGVVTVTQDSSLQDGEFIYKVKLRGTNFPSDGPVMQKKTMGWEASSERMYPEDGALKGEIKQRLKLKDGGHYDAEVKTTYKAKKPVQLPGAYNVNIKLDITSHNEDYTIVEQYERAEGRHSTGGMDELYKLEHHHH HHN-DEK -(GGS)₉-Cre-His₆ (SEQ ID NO: 130)MFTIAQGKGQKLCEIERIHFFLSKKKTDELRNLHKLLYNRPGTVSSLKKNVGQFSGFPFEKGSVQYKKKEEMLKKFRNAMLKSICEVLDLERSGVNSELVKRILNFLMHPKPSGKPLPKSKKTCSKGSKKERGGSGGSGGSGGSGGSGGSGGSGGSGGSMASNLLTVHQNLPALPVDATSDEVRKNLMDMFRDRQAFSEHTWKMLLSVCRSWAAWCKLNNRKWFPAEPEDVRDYLLYLQARGLAVKTIQQHLGQLNMLHRRSGLPRPSDSNAVSLVMRRIRKENVDAGERAKQALAFERTDFDQVRSLMENSDRCQDIRNLAFLGIAYNTLLRIAEIARIRVKDISRTDGGRMLIHIGRTKTLVSTAGVEKALSLGVTKLVERWISVSGVADDPNNYLFCRVRKNGVAAPSATSQLSTRALEGIFEATHRLIYGAKDDSGQRYLAWSGHSARVGAARDMARAGVSIPEIMQAGGWTNVNIVMNYIRNLDSETGAMVRLLE DGDHHHHHHHGF (UNIPROT: P14210 PDB: 2HGF) HGF-(GGS)₉-mCherry-His₆ (SEQ ID NO: 131)MGQRKRRNTIHEFICKSAKTTLIKIDPALKIKTKKVNTADQCANRCTRNKGLPFTCKAFVFDKARKQCLWFPFNSMSSGVKKEFGHEFDLYENKDYIRNGGSGGSGGSGGSGGSGGSGGSGGSGGSVSKGEEDNMAIIKEFMRFKVHMEGSVNGHEFEIEGEGEGRPYEGTQTAKLKVTKGGPLPFAWDILSPQFMYGSKAYVKHPADIPDYLKLSFPEGFKWERVMNFEDGGVVTVTQDSSLQDGEFIYKVKLRGTNFPSDGPVMQKKTMGWEASSERMYPEDGALKGEIKQRLKLKDGGHYDAEVKTTYKAKKPVQLPGAYNVNIKLDITSHNEDYTIVEQYERAEGR HSTGGMDELYKLEHHHHHHHGF -(GGS)₉-Cre-His₆ (SEQ ID NO: 132)MGQRKRRNTIHEFKKSAKTTLIKIDPALKIKTKKVNTADQCANRCTRNKGLPFTCKAFVFDKARKQCLWFPFNSMSSGVKKEFGHEFDLYENKDYIRNGGSGGSGGSGGSGGSGGSGGSGGSGGSMASNLLTVHQNLPALPVDATSDEVRKNLMDMFRDRQAFSEHTWKMLLSVCRSWAAWCKLNNRKWFPAEPEDVRDYLLYLQARGLAVKTIQQHLGQLNMLHRRSGLPRPSDSNAVSLVMRRIRKENVDAGERAKQALAFERTDFDQVRSLMENSDRCQDIRNLAFLGIAYNTLLRIAEIARIRVKDISRTDGGRMLIHIGRTKTLVSTAGVEKALSLGVTKLVERWISVSGVADDPNNYLFCRVRKNGVAAPSATSQLSTRALEGIFEATHRLIYGAKDDSGQRYLAWSGHSARVGAARDMARAGVSIPEIMQAGGWTNVNIVMNYIRNLDSETGAMVRLLEDGDHHHHHH HIST4 (UNIPROT: P62805 PDB: 2CV5)HIST4 -(GGS)₉-mCherry-His₆ (SEQ ID NO: 
150)MSGRGKGGKGLGKGGAKRHRKVLRDNIQGITKPAIRRLARRGGVKRISGLIYEETRGVLKVFLENVIRDAVTYTEHAKRKTVTAMDVVYALKRQGRTLYGFGGGGSGGSGGSGGSGGSGGSGGSGGSGGSVSKGEEDNMAIIKEFMRFKVHMEGSVNGHEFEIEGEGEGRPYEGTQTAKLKVTKGGPLPFAWDILSPQFMYGSKAYVKHPADIPDYLKLSFPEGFKWERVMNFEDGGVVTVTQDSSLQDGEFIYKVKLRGTNFPSDGPVMQKKTMGWEASSERMYPEDGALKGEIKQRLKLKDGGHYDAEVKTTYKAKKPVQLPGAYNVNIKLDITSHNEDYTIVEQYERAEGRHSTGGMDELYKLEHHHHHH HIST4 -(GGS)₉-Cre-His₆ (SEQ ID NO: 151)MSGRGKGGKGLGKGGAKRHRKVLRDNIQGITKPAIRRLARRGGVKRISGLIYEETRGVLKVFLENVIRDAVTYTEHAKRKTVTAMDVVYALKRQGRTLYGFGGGGSGGSGGSGGSGGSGGSGGSGGSGGSMASNLLTVHQNLPALPVDATSDEVRKNLMDMFRDRQAFSEHTWKMLLSVCRSWAAWCKLNNRKWFPAEPEDVRDYLLYLQARGLAVKTIQQHLGQLNMLHRRSGLPRPSDSNAVSLVMRRIRKENVDAGERAKQALAFERTDFDQVRSLMENSDRCQDIRNLAFLGIAYNTLLRIAEIARIRVKDISRTDGGRMLIHIGRTKTLVSTAGVEKALSLGVTKLVERWISVSGVADDPNNYLFCRVRKNGVAAPSATSQLSTRALEGIFEATHRLIYGAKDDSGQRYLAWSGHSARVGAARDMARAGVSIPEIMQAGGWTNVNIVMNYIRNLDSETGAMVRLLEDGDHHHHHH EOTAXIN 3 (UNIPROT: Q9Y258 PDE: 1G2S)EOTAXIN 3-(GGS)₉-mCherry-His₆ (SEQ ID NO: 155)MTRGSDISKTCCFQYSHKPLPWTWVRSYEFTSNSCSQRAVIFTTKRGKKVCTHPRKKWVQKYISLLKTPKQLGGSGGSGGSGGSGGSGGSGGSGGSGGSVSKGEEDNMAIIKEFMRFKVHMEGSVNGHEFEIEGEGEGRPYEGTQTAKLKVTKGGPLPFAWDILSPQFMYGSKAYVKHPADIPDYLKLSFPEGFKWERVMNFEDGGVVTVTQDSSLQDGEFIYKVKLRGTNFPSDGPVMQKKTMGWEASSERMYPEDGALKGEIKQRLKLKDGGHYDAEVKTTYKAKKPVQLPGAYNVNIKIDITSHNEDYTIVEQYERAEGRHSTGGMDELYKLEHHHHHH EOTAXIN 3-(GGS)₉-Cre-His₆(SEQ ID NO: 152) MTRGSDISKTCCFQYSHKPLPWTWVRSYEFTSNSCSQRAVIFTTKRGKKVCTHPRKKWVQKYISLLKTPKQLGGSGGSGGSGGSGGSGGSGGSGGSGGSMASNLLTVHQNLPALPVDATSDEVRKNLMDMFRDRQAFSEHTWKMLLSVCRSWAAWCKLNNRKWFPAEPEDVRDYLLYLQARGLAVKTIQQHLGQLNMLHRRSGLPRPSDSNAVSLVMRRIRKENVDAGERAKQALAFERTDFDQVRSLMENSDRCQDIRNLAFLGIAYNTLLRIAEIARIRVKDISRTDGGRMLIHIGRTKTLVSTAGVEKALSLGVTKLVERWISVSGVADDPNNYLFCRVRKNGVAAPSATSQLSTRALEGIFEATHRLIYGAKDDSGQRYLAWSGHSARVGAARDMARAGVSIPEIMQAGGWTNVNIVMNYIRNLDSETGAMVRLLEDGDHHHHHH

Example 8 Supercharged Protein-Mediated In Vivo Delivery

+36 GFP and +36 GFP:Cy3-siRNA were introduced into mice by tail-vein injection. Both +36 GFP and +36 GFP:Cy3-siRNA were localized to liver cells 1 h post-injection and detectable by western blot in liver tissue 16 h post-injection (FIG. 51).

The ability of +36 GFP to act as a protein delivery agent in vivo was tested. First, the tissue penetration of +36 GFP in the adult mouse retina was examined. 0.5 μL of 100 μM +36 GFP was injected into the subretinal space of CD1 adult mice. After 6 hours, the retinas were harvested and analyzed by fluorescence microscopy (FIG. 52). Most of the +36 GFP was observed at the photoreceptor outer segments, but significant signal was observed throughout the retina, including all three nuclear layers (the outer, inner, and ganglion cell layers) as well as in the cell processes.
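The amount of protein delivered per injection follows directly from the stated volume and concentration. The short Python sketch below shows the unit arithmetic; the function name picomoles is a hypothetical helper introduced only for illustration.

```python
# Minimal sketch of the dose arithmetic for a subretinal injection.
# 1 uL x 1 uM = 1e-6 L x 1e-6 mol/L = 1e-12 mol = 1 pmol,
# so volume in uL times concentration in uM gives picomoles directly.
def picomoles(volume_ul: float, concentration_um: float) -> float:
    """Return the amount of solute in pmol for a volume (uL) and concentration (uM)."""
    return volume_ul * concentration_um


if __name__ == "__main__":
    # 0.5 uL of 100 uM +36 GFP, as stated in the example above.
    print(picomoles(0.5, 100.0), "pmol")
```

For the injection described above, this corresponds to 50 pmol of +36 GFP per retina.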

To test the ability of +36 GFP to deliver functional protein in vivo, +36 GFP-Cre was injected into the subretinal space of RC::PFwe mouse p0 pups containing a LoxP-flanked transcriptional terminator upstream of a nuclear lacZ reporter gene. Three days after injection of 0.5 μL of 40 μM wild-type Cre, Tat-Cre, or +36 GFP-Cre, retinae were harvested, fixed, and stained with X-gal (FIG. 52). A comparison of loop-out efficiencies of wild-type Cre, Tat-Cre, and +36 GFP-Cre by ex vivo X-gal staining of p0 pup retinas harboring a nuclear LacZ reporter 72 h post-treatment showed more efficient recombination after +36 GFP-Cre treatment, suggesting more efficient Cre delivery to the retina by +36 GFP than by Tat. Similarly, +36 GFP-Cre effects recombination in vivo in murine p0 pups harboring a nuclear LacZ Cre reporter (FIG. 53). Consistent with the findings ex vivo, the in vivo recombination potency in this setting is higher for +36 GFP-Cre than for Tat-Cre. Injection of +36 GFP-Cre generated an average of 715 recombined cells per injected retina (n=6), Tat-Cre generated an average of 318 recombined cells (n=6), and wild-type Cre generated an average of 117 recombined cells per retina (n=4) (FIG. 53). To the inventors' knowledge, this is the first report of functional delivery of an enzyme into retinal cells in vivo.
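The relative recombination potency implied by the reported averages can be expressed as simple ratios. The Python sketch below computes these ratios from the mean counts stated above; the dictionary and function names are hypothetical, and because only means and group sizes are reported here, the ratios carry no estimate of variance or statistical significance.

```python
# Minimal sketch: fold differences between the reported mean recombined-cell counts.
# The counts are the averages stated in the text; names are illustrative only.
MEAN_RECOMBINED_CELLS = {
    "+36 GFP-Cre": 715,    # n = 6 retinas
    "Tat-Cre": 318,        # n = 6 retinas
    "wild-type Cre": 117,  # n = 4 retinas
}


def fold_over(counts: dict, reference: str) -> dict:
    """Return each treatment's mean count divided by the reference treatment's mean."""
    baseline = counts[reference]
    return {name: value / baseline for name, value in counts.items()}


if __name__ == "__main__":
    for reference in ("Tat-Cre", "wild-type Cre"):
        ratios = fold_over(MEAN_RECOMBINED_CELLS, reference)
        for name, ratio in ratios.items():
            print(f"{name} vs {reference}: {ratio:.2f}-fold")
```

On these averages, +36 GFP-Cre yields about 2.2-fold more recombined cells than Tat-Cre and about 6.1-fold more than wild-type Cre.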

EQUIVALENTS AND SCOPE

Those skilled in the art will recognize, or be able to ascertain using no more than routine experimentation, many equivalents to the specific embodiments in accordance with the invention described herein. The scope of the present invention is not intended to be limited to the above Description, but rather is as set forth in the appended claims.

In the claims, articles such as “a,” “an,” and “the” may mean one or more than one unless indicated to the contrary or otherwise evident from the context. Claims or descriptions that include “or” between one or more members of a group are considered satisfied if one, more than one, or all of the group members are present in, employed in, or otherwise relevant to a given product or process unless indicated to the contrary or otherwise evident from the context. The invention includes embodiments in which exactly one member of the group is present in, employed in, or otherwise relevant to a given product or process. The invention includes embodiments in which more than one, or all of the group members are present in, employed in, or otherwise relevant to a given product or process. Furthermore, it is to be understood that the invention encompasses all variations, combinations, and permutations in which one or more limitations, elements, clauses, descriptive terms, etc., from one or more of the listed claims is introduced into another claim. For example, any claim that is dependent on another claim can be modified to include one or more limitations found in any other claim that is dependent on the same base claim. Furthermore, where the claims recite a composition, it is to be understood that methods of using the composition for any of the purposes disclosed herein are included, and methods of making the composition according to any of the methods of making disclosed herein or other methods known in the art are included, unless otherwise indicated or unless it would be evident to one of ordinary skill in the art that a contradiction or inconsistency would arise.

Where elements are presented as lists, e.g., in Markush group format, it is to be understood that each subgroup of the elements is also disclosed, and any element(s) can be removed from the group. It should be understood that, in general, where the invention, or aspects of the invention, is/are referred to as comprising particular elements, features, etc., certain embodiments of the invention or aspects of the invention consist, or consist essentially of, such elements, features, etc. For purposes of simplicity those embodiments have not been specifically set forth in haec verba herein. It is also noted that the term “comprising” is intended to be open and permits the inclusion of additional elements or steps.

Where ranges are given, endpoints are included. Furthermore, it is to be understood that unless otherwise indicated or otherwise evident from the context and understanding of one of ordinary skill in the art, values that are expressed as ranges can assume any specific value or subrange within the stated ranges in different embodiments of the invention, to the tenth of the unit of the lower limit of the range, unless the context clearly dictates otherwise.

In addition, it is to be understood that any particular embodiment of the present invention that falls within the prior art may be explicitly excluded from any one or more of the claims. Since such embodiments are deemed to be known to one of ordinary skill in the art, they may be excluded even if the exclusion is not set forth explicitly herein. Any particular embodiment of the compositions of the invention (e.g., any supercharged protein; any nucleic acid; any method of production; any method of use; etc.) can be excluded from any one or more claims, for any reason, whether or not related to the existence of prior art.

All cited sources, for example, references, publications, databases, database entries, and art cited herein, are incorporated into this application by reference, even if not expressly stated in the citation. In case of conflicting statements of a cited source and the instant application, the statement in the instant application shall control.

1.-91. (canceled)
 92. A method of evaluating a supercharged protein for cell penetration comprising: optionally selecting a supercharged protein; providing said supercharged protein; contacting said supercharged protein with a cell; and determining if the supercharged protein penetrates the cell, thereby evaluating a supercharged protein for cell penetration.
 93. (canceled)
 94. A supercharged protein associated with a functional peptide or protein, comprising a supercharged protein, and a functional peptide or protein, wherein the supercharged protein is covalently bound to the functional peptide or protein, and wherein the supercharged protein associated with the functional peptide or protein is able to penetrate a cell and deliver the functional peptide or protein to the cell.
 95. (canceled)
 96. The supercharged protein associated with a functional peptide or protein of claim 94, wherein the supercharged protein is bound to the functional protein or peptide via a peptide bond, thus forming a fusion protein.
 97. The supercharged protein associated with a functional peptide or protein of claim 94, wherein the supercharged protein and the functional protein or peptide are bound to a linker connecting the supercharged protein and the functional peptide or protein.
 98. (canceled)
 99. The supercharged protein associated with a functional peptide or protein of claim 94, wherein the supercharged protein or the linker can be cleaved by a cellular enzyme.
 100.-103. (canceled)
 104. The supercharged protein associated with a functional peptide or protein of claim 97, wherein the supercharged protein or the linker comprises an amino acid sequence chosen from the group including X-AGVF-X (SEQ ID NO: 136), X-GFLG-X (SEQ ID NO: 137), X-FK-X (SEQ ID NO: 138), X-AL-X (SEQ ID NO: 139), X-ALAL-X (SEQ ID NO: 140), or X-ALALA-X (SEQ ID NO: 141), wherein X denotes a rest comprising the supercharged protein or the functional peptide or protein.
 105. The supercharged protein associated with a functional peptide or protein of claim 94, wherein the supercharged protein is a globular protein with a surface charge density or surface charge distribution similar to that of a supercharged GFP.
 106. The supercharged protein associated with a functional peptide or protein of claim 94, wherein the supercharged protein is a protein comprising a β-barrel.
 107. The supercharged protein associated with a functional peptide or protein of claim 94, wherein the supercharged protein is a supercharged GFP.
 108. (canceled)
 109. The supercharged protein associated with a functional peptide or protein of claim 94, wherein the functional protein is a protein chosen from the group including: an enzyme, a DNA-binding protein, a histone, a cytoskeletal protein, a receptor protein, a chaperone protein, a histone acetyltransferase, a histone deacetylase, a DNA methyltransferase, a kinase, a phosphatase, a protease, an oxidoreductase, a transferase, a hydrolase, a lyase, an isomerase, a ligase, a transcription factor, a tumor suppressor, a developmental regulator, a growth factor, a metastasis suppressor, a pro-apoptotic protein, a nuclease, a zinc finger nuclease, and a recombinase.
 110. The supercharged protein associated with a functional peptide or protein of claim 109, wherein the functional protein is a protein chosen from the group including: p53, Rb (retinoblastoma protein), BRCA1, BRCA2, PTEN, APC, CD95, ST7, ST14, a BCL-2 family protein, a caspase; BRMS1, CRSP3, DRG1, KAI1, KISS1, NM23, a TIMP-family protein, a BMP-family growth factor, EGF, EPO, FGF, G-CSF, GM-CSF, a GDF-family growth factor, HGF, HDGF, IGF, PDGF, TPO, TGF-α, TGF-β, VEGF; a zinc finger nuclease targeting a site within the human CCR5 gene, Cre, Dre, and FLP recombinase.
 111. (canceled)
 112. A method of delivering a functional peptide or protein to a cell, comprising: contacting the cell with a supercharged protein associated with the functional peptide or protein, under conditions sufficient for the functional peptide or protein to enter the cell.
 113-114. (canceled)
 115. The method of claim 112, wherein the peptide or protein is a nuclear peptide or protein and the contacting results in delivery of the protein to the nucleus of the cell.
 116. The method of claim 112, wherein the protein delivered to the cell is a transcription factor and/or a reprogramming factor.
 117-118. (canceled)
 119. The method of claim 116, wherein the cell is contacted with a supercharged protein associated with a reprogramming factor in an amount, for a time, and under conditions sufficient to induce reprogramming of the cell to a pluripotent state.
 120. The method of claim 119, further comprising: isolating a pluripotent cell generated from the somatic cell; differentiating the isolated pluripotent cell, or progeny thereof, into a differentiated cell type; and/or using the pluripotent cell, or differentiated progeny thereof, in a cell replacement therapeutic approach.
 121-122. (canceled)
 123. The method of claim 112, wherein the cell is a cell carrying a genomic allele associated with a disease and the supercharged protein is associated with a nuclease specifically targeting the allele.
 124-126. (canceled)
 127. The method of claim 123, wherein the nuclease targets the human CCR5 gene in a T-lymphocyte of a subject diagnosed with HIV/AIDS.
 128. (canceled)
 129. The method of claim 112, wherein the protein is a recombinase and the cell comprises a recombination site recognized by the recombinase in its genome.
 130. The method of claim 129, wherein the cell comprises a plurality of recombination sites recognized by the recombinase and recombinase-mediated recombination of the plurality of recombination sites results in deletion of a genomic region.
 131. The method of claim 112, wherein the cell is a tumor cell and the protein is a tumor suppressor protein, a metastasis suppressor protein, a cytostatic or a cytotoxic protein.
 132. The supercharged protein associated with a functional peptide or protein of claim 94, wherein the supercharged protein is selected from the group consisting of cyclon (ID No.: Q9H6F5), PNRC1 (ID No.: Q12796), RNPS1 (ID No.: Q15287), SURF6 (ID No.: O75683), AR6P (ID No.: Q66PJ3), NKAP (ID No.: Q8N5F7), EBP2 (ID No.: Q99848), LSM11 (ID No.: P83369), RL4 (ID No.: P36578), KRR1 (ID No.: Q13601), RY-1 (ID No.: Q8WVK2), BriX (ID No.: Q8TDN6), MNDA (ID No.: P41218), H1b (ID No.: P16401), cyclin (ID No.: Q9UK58), MDK (ID No.: P21741), Midkine (ID No.: P21741), PROK (ID No.: Q9HC23), FGF5 (ID No.: P12034), SFRS (ID No.: Q8N9Q2), AKIP (ID No.: Q9NWT8), CDK (ID No.: Q8N726), beta-defensin (ID No.: P81534), Defensin 3 (ID No.: P81534); PAVAC (ID No.: P18509), PACAP (ID No.: P18509), eotaxin-3 (ID No.: Q9Y258), histone H2A (ID No.: Q7L7L0), HMGB1 (ID No.: P09429), C-Jun (ID No.: P05412), TERF 1 (ID No.: P54274), N-DEK (ID No.: P35659), PIAS 1 (ID No.: O75925), Ku70 (ID No.: P12956), HBEGF (ID No.: Q99075), HGF (ID No.: P14210), HRX (ID No.: Q03164), histone 4 (ID No.: P62805), U4/U6.U5 tri-snRNP-associated protein 3 (ID No.: Q8WVK2), beta-defensin (ID No.: P81534), Protein SFRS121P1 (ID No.: Q8N9Q2), midkine (ID No.: P21741), C-C motif chemokine 26 (ID No.: Q9Y258), surfeit locus protein 6 (ID No.: O75683), Aurora kinase A-interacting protein (ID No.: Q9NWT8), NF-kappa-B-activating protein (ID No.: Q8N5F7), histone H1.5 (ID No.: P16401), histone H2A type 3 (ID No.: Q7L7L0), 60S ribosomal protein L4 (ID No.: P36578), isoform 1 of RNA-binding protein with serine-rich domain 1 (ID No.: Q15287-1), isoform 4 of cyclin-dependent kinase inhibitor 2A (ID No.: Q8N726-1), isoform 1 of prokineticin-2 (ID No.: Q9HC23-1), isoform 1 of ADP-ribosylation factor-like protein 6-interacting protein 4 (ID No.: Q66PJ3-1), isoform long of fibroblast growth factor 5 (ID No.: P12034-1), and isoform 1 of cyclin-L1 (ID No.: Q9UK58-1).
 133. The supercharged protein associated with a functional peptide or protein of claim 94, wherein the supercharged protein is a naturally-occurring supercharged protein.